INTRODUCTION
In the initial stage of development of a weapon
system it is impossible to know with complete certainty what
the  final  outcome  of  the  weapon  system  will  be  in  terms  of 
completion  time,  cost,  and  performance.  Conversely, 
decision  makers  and  technical  experts  are  not  completely 
ignorant  of  the  possible  outcomes.  Therefore,  a language 
is  needed  which  expresses  the  degree  of  belief  that  certain 
outcomes  will  occur.  Subjective  probability  is  such  a 
language.

This  chapter  presents  the  statement  of  the  research 
problem,  the  justification  for  this  research,  the  scope  and 
objective  of  the  research,  and  the  research  question. 


STATEMENT  OF  THE  PROBLEM 


The  problem  is  that  a methodology  does  not  exist 
for  assessing  the  magnitude  of  uncertainties  in  the  weapons 
acquisition  process.  The  uncertainties  surrounding  the 
weapons  acquisition  process  have  had  their  most  telling 
effect  on  the  ability  of  military  planners  to  estimate  the 
cost  of  weapon  systems,  the  time  required  for  system 
development,  and  the  capability  of  achieving  specified 
performance characteristics. Uncertainties in the
parameters of schedule and performance affect the area of
costs most dramatically (61:2). The significance of these
weapon  system  uncertainties  is  apparent  when  major  weapon 
system  costs  are  studied. 

A research  study  by  Peck  and  Scherer  of  twelve 
programs  of  the  late  1950's  revealed  an  average  weapon 
system  development  cost  growth  of  220  percent  when  compared 
to  estimated  costs  (70:429).  A later  study  by  Marschak  of 
22  weapon  system  development  programs  in  the  1960's  showed 
an  average  cost  growth  over  estimated  costs  of  226  percent 
(5:2). A General Accounting Office study in 1972 of 77
major  weapon  systems  revealed  that  new  cost  estimates  were 
running  28.7  billion  dollars  above  the  original  cost 
estimates, representing an average expense increase of 31
percent  from  original  cost  estimates  (38:2). 

R.  L.  Perry  compared  weapon  development  programs  of 
the  1950's  to  those  of  the  1960's,  and  demonstrated  that 
cost  estimates  for  the  1960's  were  about  25  percent  less 
optimistic  than  those  for  the  1950's  (100:1-2).  A follow-up 
study  by  Harman,  using  improved  data  over  Perry's  study, 
produced  different  results.  Harman  concluded  there  was  no 
indication  of  a significant  difference  between  the  1950's 
and  the  1960's  in  the  ability  of  the  weapon  system 
acquisition  process  to  estimate  costs  or  to  avoid  actual 
cost overruns (40:38). Thus, between the 1950's and the 1960's
there was no improvement in the capability of cost estimation
techniques to reduce the uncertainty in the weapon system
acquisition process with better cost information.


JUSTIFICATION 


Uncertainty  of  Future  Costs 

The following comments, made by the Blue Ribbon
Defense Panel, indicated the types of problems encountered
when cost estimates were used without regard to the
uncertainties surrounding the estimate:

Cost  estimating  for  development  programs  has 
apparently  been  too  widely  credited  in  the  Defense 
Department,  in  industry,  in  the  Congress,  and  by 
the  Public  with  a potential  for  accurate  prediction 
which is belied by the inherent technical uncertainties
in developments. The precise problems which
may be encountered in the process of attempting to
convert a technological or scientific theory or
experiment into practical, producible application
cannot be foreseen with accuracy. It should be
axiomatic that one cannot place a price on an unknown;
yet . . . the use of precontractual cost estimates
as a firm baseline for measuring performance throughout
the life of the system, and the shock reaction
which is forthcoming when cost overruns are experienced,
all evidence an unwarranted degree of confidence
in cost estimates [32:83].

The  critical  factor  determining  the  accuracy  of  a 
cost  estimate  is  the  degree  of  uncertainty  present  at  the 
time  the  cost  estimate  is  made.  Summers,  in  a study  of 
cost  estimates  as  predictors  of  actual  weapon  system  cost, 
stated  that  "considerations  of  uncertainty  go  a long  way 
toward  explaining  the  differences  in  accuracy  of  cost 
estimates [84:7]."

At each stage of a major weapon development and
production program, from the conceptual to the deployment
of the finished weapon system, an estimate of cost is a
major input in management decision making (35:153; 100:31).
Prior to the actual award of the contract, the program
manager is primarily concerned with influencing the future
cost growth of his program. Four types of cost estimates
are available to aid the program manager in this concern:
Cost Analysis Improvement Group (CAIG) estimates; Independent
Cost Estimates (ICE); "in house" Systems Program Office
estimates; and contractor proposal estimates (62:15).

After contract award, the program manager must
monitor  control  systems  in  order  to  preclude  program  cost 
growth  (62:15-16).  In  making  his  decisions,  the  program 
manager  must  weigh  the  prospective  value  of  the  end  product 
and  its  prospective  date  of  completion  against  an  estimate 
of  its  cost.  It  is  under  conditions  of  uncertainty  that 
the program manager must evaluate and select among alternative
proposals for future courses of action, and attempt
to  control  program  cost  growth. 

J.  Ronald  Fox,  a former  Assistant  Secretary  of  the 
Army  for  Installations  and  Logistics,  stated  that: 

All along the road, from idea to systems-in-place,
choices  must  be  made  that  will  significantly  affect 
the  ultimate  costs  of  the  acquisition  process.  If 
the  services  are  to  achieve  sound  financial  control 
of  these  costs,  it  is  essential  that  they  have  sound 
and  reliable  estimates  of  the  cost  implications  of 
their choices [35:154].

It  is  not  clear,  however,  that  decision  makers 
have  accepted  the  fact  that  knowledge  of  the  uncertainties 
associated with predictions of future costs is as vital as
the estimate itself (100:31). To overcome the problems
pointed out by the Blue Ribbon Defense Panel, decision
makers must also obtain information regarding the uncertainties
surrounding cost estimates. If the cost estimate
involves  an  advanced  weapon  system,  it  is  not  sufficient  to 
ask  whether  the  cost  estimate  is  uncertain.  As  Dienemann 
puts  it: 

It  is  an  inescapable  fact  that  estimates  of 
resource  requirements  for  future  systems  are  beset 
with  uncertainty.  The  question  is  not  whether 
uncertainty exists, but rather in determining the
magnitude and nature of the uncertainty [29:1].

This  uncertainty  can  be  reduced  through  an  analysis 
of  the  risks  associated  with  the  weapon  system  acquisition, 
where  risk  is  considered  as: 

. . . the  probability  that  a planned  event  will 
not  be  attained  within  the  prescribed  constraints 
as  defined  by  the  cost,  schedule,  and  performance 
criteria  by  following  a specified  course  of  action 
[58:6]. 

Dr.  Robert  Seamans,  then  Secretary  of  the  Air 
Force,  emphasized  the  need  for  risk  analysis  in  a speech 
to  the  Air  Force  Association  in  September,  1969: 

Still  another  significant  reason  for  cost  growth 
in the last few years has been the failure to adequately
identify the risks associated with major
programs.  This  should  occur  early  in  the  project 
definition  phase.  Late  recognition  of  significant 
uncertainties  can  be  disastrously  expensive.  In  the 
future,  we  will  make  a formal  risk  analysis  of  each 
of our programs. We must guard against the combination
of optimistic pressures, including our own eagerness
to get on with the job [58:2].

Former Deputy Secretary of Defense Packard further
stated in a letter to the Secretaries of the Armed Services:

I would like each of you to focus more attention
on this matter and assure that during concept formulations,
areas of high technical risk are identified
and fully considered; formal risk analysis on each
program is made; and summaries on these are made part
of the backup material for the program [58:2].

Decision makers would be aided in two ways by explicit
information describing the magnitude and nature of the
uncertainty of cost estimates. First, the possibility that
the ultimate system cost could differ from the expected cost
estimate, and the probability of that event, can be
anticipated, evaluated, and used as a control mechanism for
system cost growth. Second, with a quantitative measure of
the precision of cost estimates, decision makers would be
better able to judge alternative courses of action on the
basis of their preferences and attitudes toward risk (29:2).

The  essence  of  the  preceding  discussion  of  the 
uncertainty  of  cost  estimates  is  that  future  final  costs 
may  assume  a range  of  values.  The  relative  likelihood  of 
occurrence  of  a value  within  this  range  is  expressed  through 
probability. The probability distribution of final cost can
be either discrete or continuous. A discrete distribution
(strictly, a probability mass function) assigns a probability
to each of a finite number of possible values of the final
cost variable. A continuous probability density function can
be thought of as a curve, where the height of the curve
indicates the relative likelihood of costs near the
corresponding point on the horizontal (cost) axis, and the
entire range of possible values of the variable is
represented. The variance is a measure of the dispersion of
possible final costs around the expected final cost, and is
therefore an indicator of the uncertainty involved in final
costs. The greater the variance, the greater the
uncertainty of the estimate.
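
As a minimal sketch of the discrete case described above, the
expected final cost and its variance can be computed directly
from an assessed distribution. The cost values, the
probabilities, and the function names below are illustrative
assumptions, not figures or methods taken from any program.

```python
from math import isclose, sqrt

# Hypothetical subjective distribution over final cost, in millions of
# dollars. Both the cost values and the probabilities are illustrative
# assumptions, not data from any program.
cost_distribution = {100.0: 0.2, 120.0: 0.5, 150.0: 0.3}


def expected_value(dist):
    """Probability-weighted mean of a discrete distribution."""
    return sum(value * prob for value, prob in dist.items())


def variance(dist):
    """Probability-weighted squared deviation from the expected value."""
    mean = expected_value(dist)
    return sum(prob * (value - mean) ** 2 for value, prob in dist.items())


# The probabilities of a discrete distribution must sum to 1.
assert isclose(sum(cost_distribution.values()), 1.0)

mean_cost = expected_value(cost_distribution)
var_cost = variance(cost_distribution)
print(f"expected final cost: {mean_cost:.1f}")
print(f"variance: {var_cost:.1f}, standard deviation: {sqrt(var_cost):.1f}")
```

A wider spread of assessed costs around the same expected
value would raise the variance, which is the sense in which
variance indicates the uncertainty of the estimate.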

Probability 

There is substantial disagreement among decision
theorists and statisticians as to the essential meaning of
probability. This controversy is centered around two distinct
points of view, the objectivist and the subjectivist.
The objectivist point of view relates probability to a
frequency of occurrence (30:51). Probability, according to
this viewpoint, is a statistic estimated from repeated
observations of some directly observable phenomenon (82:xii).
According to the subjectivist viewpoint, the probability
of an event is the degree of belief or degree of confidence
placed in the occurrence of an event by a particular
individual, based upon that individual's experience (3:17).

The objectivists believe that, on the basis of a
given body of evidence, there is only a single value for
the probability of the truth of a statement. The
subjectivists hold that such a probability need not be
unique: it may take on any numerical value between 0 and 1
corresponding to the level of belief of the estimator
(30:51).


Probability from the objectivist point of view has
proven highly successful in many applications which rest
upon the existence of a stable, physical process of which
repeated observations can be made (82:xii). However, many
real world decision makers face the necessity of taking
definite action in the face of substantial uncertainty
regarding the outcomes of nonrepetitive phenomena. In
such situations, where no data or very little data exist,
subjective probability provides the only means of
quantifying the uncertainty in cost estimates (3:19;
82:xiii).

The process by which military cost estimates are
generated is not a stable, repetitive phenomenon to which
the objectivist viewpoint of probability can be applied.
Therefore, the assumption that subjective probability
provides the best means of quantifying the uncertainty in
cost estimates was advanced for the proposed research.

In an examination of the methodology of risk
assessment within the U. S. Air Force Aeronautical Systems
Division (ASD), Williams concluded that:

. . . the method best suited for quantifying
uncertainty would be subjective probability
distributions . . . . This was, however, the most
infrequently used and least widely known method of those
surveyed [100:26].

In another paper devoted to the subject of risk
analysis for the materiel acquisition process, Hwang
concluded that "techniques to collect subjective judgements
must be developed further [52:62]."

The  final  report  of  the  USAF  Academy  Risk  Analysis 
Study  Team,  sponsored  by  ASD,  concluded  that  in  the  area  of 
quantitative  risk  assessment,  aggregation  output  techniques 
(such  as  network  analysis)  are  far  more  advanced  than  the 
techniques  for  obtaining  input  data  (such  as  subjective 
probabilities). The study team recommended that "funding
priority for improving methods for quantitative risk
assessment should be given to the development of input
techniques [60:8]."

This  research  effort  should  contribute  to  the 
development  of  subjective  probability  assessment  techniques 
to  assess  the  magnitude  of  uncertainties  present  during  the 
weapon  system  acquisition  process  and  to  the  integration  of 
these  techniques  into  the  weapon  system  acquisition  process. 

SCOPE 

The  research  will  evaluate  subjective  probability 
assessment  techniques,  and  will  provide  a technique  for  use 
in  quantifying  the  magnitude  of  uncertainty  in  the  weapon 
system  acquisition  process. 

OBJECTIVE 

The research objective is to evaluate existing
subjective probability assessment techniques, in order to
propose an approach which will best assess the magnitude
of uncertainty which exists relative to a given weapon
system's development.

RESEARCH  QUESTION 

What  existing  subjective  probability  assessment 
technique  would  best  assess  the  magnitude  of  uncertainty 
in  a given  weapon  system's  development  effort? 


Chapter 2
BACKGROUND

OVERVIEW

This  chapter  first  presents  classifications  of 
weapon  system  program  uncertainties  from  the  standpoint  of 
both  the  Department  of  Defense  (DOD)  and  the  contractor. 
The  value  of  these  classifications  lies  in  the  breaking 
down  of  the  weapon  system  acquisition  uncertainty  problem 
into its component parts. Secondly, the chapter examines the
statistical and psychological aspects of subjective
probability. Subjective probabilists attempt to characterize
the collection of probability judgements that are admissible
from a normative standpoint, while psychologists
attempt to describe the actual mechanisms by which
individuals assess probability. Next, the chapter describes
six techniques for assessing subjective probabilities.
Lastly,  the  chapter  examines  what  subjective  probability 
assessment  procedures  are  required  in  weapon  system  source 
selection. 

WEAPON  SYSTEM  UNCERTAINTIES 

The U. S. Air Force (USAF) Academy Risk Analysis
Study Team, sponsored by the USAF Aeronautical Systems
Division (ASD), classified weapon system uncertainties into
four categories: target, internal program, process, and
technical (60:23-30). Target uncertainty is the uncertainty
involved in reducing a need to cost, schedule, and
performance goals. Internal program uncertainty is the
uncertainty inherent in selecting a particular method or
strategy for dealing with a given problem. Process
uncertainties concern military service priorities, other weapon
programs, DOD policy, the budget submitted to Congress by
the President, and congressional policy considerations.
Technical uncertainty treats the question of whether a
system can be developed at all, or the degree of difficulty
which will be involved in building it. A list of
uncertainties is attached as Appendix A for reference (52:63-65).

The  above  classification  of  uncertainty  is  from  the 
point  of  view  of  DOD  acquisition  of  weapon  systems.  Lenox 
gives  another  classification  of  the  uncertainties  involved 
in large government weapon system programs from the
viewpoint of the contractor. These uncertainties are
(58:18-19):

1.  The  size,  nature,  and  timing  of  a future  market 
for  a weapon  system. 

2. Uncertainty of achieving technical design
objectives within the constraints of time and resources
available.

3. Uncertainty concerning whether designs are
producible.

4. Risks due to phasing of functional tasks.

5. Tradeoffs between the cost of meeting the
schedule and the penalties involved in not meeting the
schedule.


6.  The  resources  which  are  available  or  which  can 
be  made  available  in  taking  on  or  continuing  a weapon 
system  program. 

7.  The  degree  to  which  a contractor  contractually 
obligates  himself  to  both  the  government  and  other  companies 
as  subcontractors. 

8.  The  value  and  accuracy  of  information  which  the 
contractor's  decision  makers  can  expect  to  receive. 

Technical  uncertainty  can  be  further  broken  down 
into  those  items  you  know  you  don't  know  or  the  anticipated 
unknowns,  and  those  items  you  don't  know  you  don't  know  or 
the  unanticipated  unknowns  (58:17).  This  may  be  one  reason 
why the technical aspects of uncertainty have been
overemphasized in the past, leading to inadequate consideration
of other types of uncertainty in weapon acquisition
programs. A study effort on quantitative risk assessment
found that:

Within  DOD,  there  is  little  syntactical  convention 
with  regard  to  the  term  'risk  analysis.'  On  talking 
to  members  of  other  program  offices  within  ASD,  we 
found  that  'risk  analysis'  generally  infers  analysis 
of  what  we  have  called  'technical  uncertainty'  [1:66]. 

The Air Force Academy Risk Analysis Team concluded
that:

. . . technical uncertainty was only the visible
tip of the iceberg. Submerged beneath the surface
and frequently having a far greater impact . . .
were a number of additional uncertainties, none of
them purely 'technical' in origin . . . [60:22-23].

SUBJECTIVE  PROBABILITY 
Statistical  Aspects 

Eisner  and  McDonald  (30:52)  traced  the  evolution  of 
some of the underlying ideas and terms of subjective
probability. Terms that were of interest were "degree of belief"
and "coherence."

The term "degree of belief" was attributed to
Bernoulli. In 1713, he used the expression "degree of
confidence" in the sense that since the occurrence of an
event cannot be predicted with certainty, there is only a
"degree of confidence" in the assertion of the occurrence.
In 1847, DeMorgan specified probability in terms of a
measurable degree of belief that could be related to
personal feeling. Ramsey and Borel in the 1920's noted
that the individual's degree of belief has a direct
correspondence to observable behavior in decision making. The
minimum odds that a person will accept in a betting
situation are a measure of that person's degree of belief.
The property of coherence introduced by Ramsey ensures that
the person can never accept a bet or a set of bets which he
is certain to lose. Strict coherence as attributed to Shimony
requires that the bettor accept odds such that he always
wins a net amount (30:52).
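
The link between accepted odds and degree of belief can be
sketched as follows. This is an illustration only: the
function names and the numbers are hypothetical, and the
second function assumes the usual betting framing in which
the assessor will buy or sell unit tickets at his stated
prices, an assumption that is not spelled out in the source.

```python
from math import isclose


def implied_probability(stake, payoff):
    """Degree of belief implied by the least favorable odds accepted.

    A person willing to risk `stake` to win `payoff` on an event, but
    nothing more, acts as if the event has this probability.
    """
    return stake / (stake + payoff)


def guaranteed_loss(p_event, p_complement):
    """Loss per unit ticket that an opponent can force on the bettor.

    Assumes the bettor will buy or sell, at his stated prices, a ticket
    paying 1 on the event and a ticket paying 1 on its complement.
    Exactly one ticket pays, so any total price other than 1 is a
    certain loss for the bettor.
    """
    return abs(p_event + p_complement - 1.0)


# Hypothetical assessor: risks 3 to win 1 that a test succeeds, and
# risks 2 to win 3 that the same test fails.
p_success = implied_probability(3, 1)   # 0.75
p_failure = implied_probability(2, 3)   # 0.40
print(p_success, p_failure, guaranteed_loss(p_success, p_failure))
```

In this example the two implied probabilities sum to more
than 1, so the pair of bets is incoherent: whichever way the
test turns out, the assessor pays more than he receives.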
Provided that such "degrees of belief" are assessed
quantitatively and in a coherent manner, the probabilities
so assessed can be shown to conform to the axioms of
probability theory (26:110). If the events are mutually
exclusive and collectively exhaustive, the axioms which the
probabilities must meet are as follows:

1. The sum of the weights assigned to any set of
mutually exclusive and collectively exhaustive events is
equal to 1.

2.  The  weight  assigned  to  any  event  shall  be  a 
number  between  0 and  1,  inclusive,  0 representing  complete 
conviction  that  the  event  will  not  occur  and  1 representing 
complete  conviction  that  it  will  occur. 

3.  If  two  or  more  mutually  exclusive  events  are 
grouped  into  a single  event,  the  weight  attached  to  this 
single  event  shall  be  equal  to  the  sum  of  the  weights 
attached  to  the  original  events. 
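
The three axioms above can be checked mechanically for any
set of assessed weights. The following is a minimal sketch;
the function names and the schedule outcomes are hypothetical
and serve only to illustrate the checks.

```python
from math import isclose


def axiom_violations(weights):
    """Return a list of violated axioms for weights on an exhaustive,
    mutually exclusive set of events (empty list means none)."""
    problems = []
    if any(w < 0.0 or w > 1.0 for w in weights.values()):
        problems.append("a weight lies outside [0, 1]")
    if not isclose(sum(weights.values()), 1.0):
        problems.append("weights do not sum to 1")
    return problems


def grouped_weight(weights, events):
    """Weight of a compound event: the sum of its members' weights."""
    return sum(weights[e] for e in events)


# Hypothetical assessment of schedule outcomes (labels are illustrative).
schedule = {"early": 0.1, "on time": 0.5, "late": 0.4}
print(axiom_violations(schedule))
print(grouped_weight(schedule, ["early", "on time"]))
```

The first function covers axioms 1 and 2; the second applies
axiom 3 to obtain the weight of a grouped event such as
"not late."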

In subjective probability assessments there is no
correct or objective probability: the probability of an
event is what the assessor believes it to be. From the
assessor's viewpoint, no assessment can be wrong provided
it is coherent, made with due care, and made with
consideration of all known, relevant facts. Subjective probability
theory does not prescribe what opinions people should have,
but rather how their opinions should be held and modified
on receipt of new information (49:271). At any given point
in time the decision maker's (or expert's) state of
information about some uncertain quantity can be represented
by a set of probabilities. When new information is
obtained, these probabilities are revised so that they
reflect the new information.
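
One standard way to carry out such a revision is Bayes' rule.
The text above does not name a revision rule, so its use here
is an assumption, and the hypotheses, prior weights, and
likelihoods in this sketch are hypothetical.

```python
from math import isclose


def revise(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: value / total for h, value in unnormalized.items()}


# Hypothetical prior belief about a program's cost outcome.
prior = {"within estimate": 0.6, "overrun": 0.4}
# Assumed probability of observing a failed prototype test under each outcome.
likelihood_of_failed_test = {"within estimate": 0.2, "overrun": 0.7}

posterior = revise(prior, likelihood_of_failed_test)
print(posterior)
```

Because a failed test is assumed to be more likely when an
overrun is coming, observing one shifts belief toward the
overrun outcome while the revised weights still sum to 1.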

Psychological  Aspects 

De Finetti has stated that:

The true . . . subjective probability . . .
problem consists in the investigations concerning
the ways in which such abilities may be improved.
This seems to me the field in which the cooperation
between all specialists concerned is most wanted,
and that is particularly true for the expected
contribution from psychologists [49:271].

Cognitive  psychology  is  that  branch  of  psychology 
which  includes  the  study  of  perception,  problem  solving, 
judgemental  processes,  thinking,  concept  formation,  and 
human  information  processing.  This  branch  of  psychology 
has directed its effort toward understanding the mechanisms
by which man confronts and interprets stimuli with which he
is faced and particularly toward specifying man's abilities
and limitations as an information processing system
(49:272).

Much  of  this  cognitive  psychology  research  effort 
can  be  criticized  on  at  least  two  accounts: 

1. Research on human behavior in inferential and
decision making situations has included a considerable
amount of experimental work, much of it conducted in simple,
artificial situations. The artificial nature of these
situations renders their generalization to realistic
situations tenuous, thus leaving the implications of the
results for actual real world situations questionable
(105:252,260).

2. Most studies have involved people as subjects
who are neither substantive nor normative experts.
Substantive expertise refers to knowledge which the assessor
has concerning the subject matter of interest; normative
expertise is the ability of the assessor to express his
opinions in probabilistic form (49:272).

The  study  of  judgemental  processes  has  produced  two 
general  conclusions: 

1.  Man  has  limited  information  processing  capacity. 

2.  The  nature  of  the  judgemental  task  with  which 
man  is  faced  determines  to  a large  extent  the  possible 
strategies  he  may  use  to  deal  with  that  task  (49:272). 

Given only limited processing capability, man is
forced to function in a serial fashion in that he cannot
simultaneously integrate a large amount of information.
Furthermore, man must act in a selective manner in order
to simplify his environment (50:299). The result, according
to Hogarth, a psychologist, is that man is a "selective,
stepwise information processing system with limited
capacity, ... he is ill equipped for assessing subjective
probability distributions [49:273]."

Winkler, a subjective statistician, disagrees with
Hogarth's claim that man is ill suited to assess probability
distributions. As an example of good performance documented
by a large body of data, Winkler cites weather forecasts
of precipitation. On the average, subjective precipitation
forecasts have been as good as or better than probability
forecasts determined by objective means (103:290).

Lichtenstein found that subjects who appear to do
poorly in a complex probability estimation exercise may be
making careful estimates based on a different
data-generating model than the one used by the experimenters.
In a study performed to explore the ability of subjects to
estimate probabilities, the subjects' responses differed
from the correct (true) values, which on the surface
suggested that the subjects were poor probability estimators.
However, careful examination revealed that many of the
subjects were careful and consistent probability estimators
who were simply using a different data-generating model
than the experimenters (59:62).

Uncertainty  is  considered  implicitly  by  obtaining 
the  information  concerning  uncertainty  and  factoring  it 
into  the  decision  making  process  just  as  information  about 
several  other  factors  is  integrated  in  the  decision  maker's 
or  expert's  mind.  Uncertainty  is  considered  explicitly  by 
expressing it in probabilistic form (99:163-164).

The  opinion  an  expert  is  asked  to  reveal  is, 
implicitly  or  explicitly,  his  evaluation  of  certain 
probabilities.  This  is  particularly  brought  out  by 
Grayson,  discussing  geologists'  evaluation  of  the  success 
of a proposed oil-well drilling. Grayson states:

Actually, operators are obtaining a form of
personal probabilities from their geologists at
the present time, although they do not refer to
them by such a term. . . .

Numbers . . . are merely another form of
language, permitting subjective judgement to be
put into a more precise form, a form which is tractable
when relating the expert's evaluation to other facets
of the drilling problem [39:255].

Hogarth also notes that man frequently ignores
uncertainty, the reduction or omission of uncertainty
itself being a useful cognitive simplification mechanism
(49:273). Winkler responds by asking whether uncertainty
is  ignored  or  simply  considered  implicitly  rather  than 
explicitly  (103:291). 

Hogarth's  rejoinder  to  Winkler  is  based  upon  the 
concept  that  ignoring  uncertainty  can  be  conceived  on  two 
levels.  At  one  level,  man  in  a complex  situation  may 
mentally  replace  a set  of  uncertainties  with  their  certainty 
equivalent.  At  the  other  level,  uncertainty  causes  anxiety 
which  man  attempts  to  avoid  (51:294). 

Even  in  the  "rational"  business  world,  there  is 
evidence  that  businessmen  avoid  uncertainty.  The  studies 
of  Cyert  and  March  indicate  that  business  organizations 
avoid  uncertainty  in  two  ways: 

1. They solve pressing short-run problems rather
than  developing  long-range  strategies,  thus  avoiding  the 
requirement  that  they  correctly  anticipate  future  wants. 

2. They avoid planning, where plans depend upon
prediction of uncertain events, by imposing standard
operating procedures, industry tradition, and
uncertainty-absorbing contracts on the environment (18:119).

Given  limited  information  processing  ability,  man 
structures  his  environment.  When  faced  with  the  task  of 
estimating  statistics  intuitively  from  data  in  experiments, 
subjects  have  been  fairly  accurate  at  guessing  central 
tendency values, but not variances of the data (49:274).

In  an  experiment  subjects  saw  samples  of  each  of 
two  populations  of  numbers  and  made  intuitive  inferences 
about  which  population  had  the  larger  variance.  They  then 
either  estimated  the  ratios  of  the  variances  or  stated 
their  confidence  (subjective  probability)  in  their 
inferences.  These  ratios  were  used  to  infer  the  subjective 
magnitudes  of  the  sample  variances.  The  inferred  ratios 
were inaccurate because of the subjects' tendency to
underweight deviant sample data, and because the subjects
regarded sets of large numbers as less variable than sets
of small numbers (7:109).

Subjects  also  tend  to  assess  distributions  which 
are  shaped  like  the  normal  distribution.  While  Winkler  has 
suggested  that  this  is  due  to  the  emphasis  on  the  normal 
distribution  in  statistics  courses,  Hogarth  believes  that 
it  is  due  to  the  human  perceptual  desire  for  symmetry 
(49:275). 

Until recently, research concentrated on ascertaining
how human judgements deviate from a normative statistical
model. However, Kahneman and Tversky have focused
on the question "how do people evaluate uncertainty" rather
than on "how well do people evaluate uncertainty" (53:452).
The  theme  of  Kahneman  and  Tversky's  research  is  that 
judgements  under  uncertainty  are  based  upon  a limited 
number  of  mental  operations,  or  heuristics.  When  faced 
with  the  difficult  task  of  judging  probabilities,  people 
employ  heuristics  to  reduce  these  judgements  to  simpler 
ones.  Kahneman  and  Tversky  have  described  four  such 
heuristics:  the  law  of  small  numbers,  judgement  by 

adjustment,  judgement  by  representativeness,  and  judgement 
by  availability. 

The  law  of  small  numbers  is  the  belief  that  even 
small  samples  are  highly  representative  of  the  population 
from  which  they  are  drawn.  People  expect  any  two  small 
samples  drawn  from  a particular  population  to  be  more 
similar  to  one  another  and  the  population  than  sampling 
theory predicts (93:105). Kahneman and Tversky's study
of experienced research psychologists demonstrated that the
psychologists overestimated the statistical significance of
a small sample, overestimated the replicability of the
results, and rarely attributed a deviation of the results
to sampling variability (93:109).

In  judgement  by  adjustment,  individuals  estimate  an 
unknown  value  by  starting  from  some  initial  value  which  is 
then  adjusted  to  yield  the  final  answer.  The  initial  value 
may be suggested by the problem or may be the result of a
partial computation. Different initial values yield different
final estimates, which are biased toward the initial
value (91:153-154).

An  individual  who  follows  the  representativeness 
heuristic  evaluates  the  probability  of  an  uncertain  event, 
or  a sample,  by  the  degree  to  which  it  is  similar  to  its 
parent  population,  and  the  degree  to  which  it  reflects  the 
features of the process which generated it (53:431). When
the event or sample in question is highly representative of
the process from which it originates, its probability is
judged high. If the event is not representative, its
probability is judged low (91:149).

The  representativeness  heuristic  approach  leads  to 
serious  biases  because  several  of  the  factors  that  should 
be  considered  in  evaluating  probability  have  no  role  in 
judgements of similarity. One of these factors is prior
probabilities; another is the size of the sample. To
evaluate the probability of obtaining a particular result
in a sample drawn from a specific population, people assess
the degree to which the sample is representative of the
population. The similarity of a sample statistic to a
population parameter does not depend on the size of the
sample, so the judged probability is largely insensitive
to sample size (91:150).
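
The role of sample size can be illustrated with a short calculation. The sketch below is an illustration only and is not taken from the cited studies. It assumes a population in which an attribute occurs with probability 0.5, and computes the exact binomial probability that at least 60 percent of a sample shows the attribute. The function name and the two sample sizes are illustrative choices.

```python
from math import comb


def prob_at_least(n, k, p=0.5):
    """Exact binomial probability of k or more successes in n trials."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Same sample proportion (60 percent), two different sample sizes.
small = prob_at_least(10, 6)     # at least 6 of 10
large = prob_at_least(100, 60)   # at least 60 of 100

print(f"n = 10:  {small:.3f}")
print(f"n = 100: {large:.3f}")
```

The probability is about 0.38 for the sample of 10 and under 0.03 for the sample of 100, a difference that a judgement based only on similarity to the population does not register.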

An individual is said to have used the availability
heuristic whenever he estimates frequency or probability
by the ease with which instances or associations can be
brought to mind (92:208). Availability provides a mechanism
by which occurrences of extreme utility or disutility may
appear more likely than they actually are.

Like representativeness, availability is useful for
assessing  frequency  or  probability.  Both  heuristics  use 
mental  effort  to  gauge  subjective  probability.  However, 
availability  is  affected  by  factors  such  as  familiarity, 
salience,  or  recency. 

The major difference between the representativeness
and availability heuristics is in the nature of the judgement
which underlies the evaluation of subjective probability.
According to the representativeness heuristic, one
evaluates subjective probability by the degree of correspondence
between the sample and the population. On the
other hand, in the availability heuristic, subjective
probability is evaluated by the difficulty of retrieval
and reconstruction of instances. Thus, the representativeness
heuristic is more likely to be employed when
events are characterized by their general properties,
while the availability heuristic is employed when events
are characterized in terms of specific occurrences (53:452).

An  important  issue  is  whether  the  expression  of 
judgement  in  probabilistic  form  is  meaningful  to  both 
assessors  and  the  recipients  of  such  opinions.  In  Hogarth's 
view,  there  are  three  necessary  conditions  for  probability 
assessment  to  be  considered  meaningful  from  the  assessor's 
viewpoint:

1. The task should be meaningful to the assessor
in that he is reasonably familiar with it.

2. Justification for the assessment depends on the
extent to which it can predict more accurately than the
best available statistical model.

3. Judgements expressed in probabilistic form are
more accurate and useful than those expressed normally
(49:278-279).

Another  important  issue  in  subjective  probability 
assessment  is  the  effects  of  different  methods  of  eliciting 
subjective  probability  distributions.  Winkler  has  stated: 

It  must  be  stressed  that  the  assessor  has  no 
built-in  prior  distribution  which  is  there  for  the 
taking. That is, there is no 'true' prior distribution.
Rather, the assessor has certain prior
knowledge  which  is  not  easy  to  express  quantitatively 
without  careful  thought.  An  elicitation  technique 
used  by  the  statisticians  does  not  elicit  a 'true' 
prior  distribution,  but  in  a sense  helps  to  draw 
out  an  assessment  of  a prior  distribution  from  the 
prior  knowledge.  Different  techniques  may  produce 
different  distributions  because  the  method  of 
questioning  may  have  some  effect  on  the  way  the 
problem  is  viewed  [104:778]. 

As far as the relative merits of the different
techniques are concerned, the results of isolated experiments
in the assessment of continuous probability distributions
have been contradictory. However, for estimation
of  discrete  probability  distributions,  an  experiment  has 
shown  reasonable  consistency  between  probability  estimates 
inferred  from  bets  and  direct  assessments  (6). 

Consistency between different assessment methods
appears to be governed by the depth of the assessor's
knowledge of the elicitation method used. Naive subjects
show inconsistencies between different methods in assessing
continuous variables, while they show fairly consistent
responses in assessing fairly simple discrete events.
Normative experts show little inconsistency between
different methods (49:279), demonstrating that statistical
training is of great help in assessing probabilities
(49:282).

According to Hogarth, a body of systematic psychological
experimental information in this area of subjective
probability is lacking. A major question is how to deal
with the experimental data to date in which naive subjects
have yielded responses utilizing subjective probability
techniques which they did not fully understand. Furthermore,
the effects of differences in personality and background
have rarely been examined in the experiments
(49:284). Winkler feels that cognitive psychology "may have
important implications for probability assessment, and
further work needs to be done to investigate such
implications [103:290]."

SUBJECTIVE  PROBABILITY  ASSESSMENT 
TECHNIQUES 

Techniques for assessing subjective probabilities
are relatively recent; their application to statistical
problems has occurred mainly in the post World War II era
(3:17). The assessment techniques below have been developed
for eliciting an expert's assessment of the probability of
an occurrence of an event.

The  Choice-Between-Gambles 
Technique 

This technique employs betting-type or gambling
situations to elicit probability density or cumulative
probability density functions (3:23). To obtain a probability
density function, the expert is offered a choice
between a real-world gamble involving values of the item
under consideration with unspecified probabilities, and a
hypothetical gamble involving two events with given
probabilities (3:25). Consider an example from Atzinger
involving a design thrust for a jet engine (3:25-27).
Initially, the expert is offered these choices:

1. Real-world gamble: a payoff of $10 if the
thrust reached is 36,000 ± 1,000 lb., with unknown probability,
and a payoff of $0 if the thrust reached is not
36,000 ± 1,000 lb., with unknown probability.

2. Hypothetical gamble: a payoff of $10 if event
E1 occurs, with the probability of E1 occurring being 0.5
(P(E1) = 0.5); or $0 if event E2 occurs, with probability
0.5 (P(E2) = 0.5).

If the expert's decision in the first round is to
accept the real-world gamble, it is inferred that his
subjective probability assessment that a thrust of 36,000 ±
1,000 lb. will be achieved is greater than 0.5. Thus, in
the next iteration, the analyst adjusts the probability of
occurrence of hypothetical event E1 upward and event E2
downward; this procedure is continued until the expert is
indifferent to choosing between the two gambling situations.
If this stage is reached at P(E1) = 0.7, P(E2) = 0.3, then
it is inferred that P(thrust = 36,000 ± 1,000 lb.) = 0.7.

The thrust value is then changed by an interval chosen so
that the expert can discriminate between its probability of
occurrence and the previous value; for example, from
36,000 lb. to 34,000 lb. This procedure is continued until
a probability distribution is obtained, as shown below:

Thrust (lb.)          Probability

32,000 ± 1,000        0.0
34,000 ± 1,000        0.2
36,000 ± 1,000        0.7
38,000 ± 1,000        0.2
40,000 ± 1,000        0.0


If  the  sum  of  the  probabilities  is  greater  than  one,  as 
above,  the  analyst  reassesses  the  expert’s  probabilities  or 
normalizes  the  derived  probabilities  by  dividing  each  one 
by  the  sum  of  all  the  subjective  probabilities  (3:27). 
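
The normalization step can be sketched in a few lines. This is a minimal illustration using the five elicited values from the thrust example, whose probabilities sum to 1.1; the function and variable names are assumptions made for the sketch and do not come from the cited source.

```python
def normalize(elicited):
    """Divide each elicited probability by the sum of all of them."""
    total = sum(elicited.values())
    if total <= 0:
        raise ValueError("at least one elicited probability must be positive")
    return {value: p / total for value, p in elicited.items()}


# Thrust interval midpoints (lb.) and the probabilities inferred from the expert.
elicited = {32000: 0.0, 34000: 0.2, 36000: 0.7, 38000: 0.2, 40000: 0.0}
normalized = normalize(elicited)

for thrust, p in normalized.items():
    print(f"{thrust:>6} lb.  {p:.3f}")
```

After normalization the probability for 36,000 ± 1,000 lb. falls from 0.7 to about 0.64, and the five values sum to one. Whether to normalize or to go back to the expert remains a matter for the analyst's judgement.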

In obtaining a cumulative distribution function the
probabilities of occurrence of E1 and E2 are fixed, while
the characteristic values of thrust are changed until
indifference is achieved (3:29). Thus, the probabilities
of E1 and E2 would be fixed at 0.5; the initial indifference
point would give the value of thrust for which the
probability was 0.5. The next iteration would obtain a
value for thrust which equally divided each of these intervals;
that is, the thrust values for which the probabilities
were 0.25 and 0.75. This procedure is iterated until
an upper thrust value is reached with a probability equal
to 1, as shown below:


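[Figure: successive division of the cumulative probability
scale. From the initial iteration at 0.500, the lower branch
gives cumulative probabilities of 0.250, 0.125, and 0.000
(at the lowest thrust value), and the upper branch gives
0.750, 0.875, and 1.000 (at the upper thrust value).]
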


The lowest thrust value is TL, and TU is the upper thrust
value.  The  values  can  then  be  combined  in  ascending  order 
to  obtain  the  cumulative  distribution. 
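
Combining the elicited points into a cumulative distribution can be sketched as follows. The thrust values paired with each cumulative probability are hypothetical, since the source gives only the probability levels, and the linear interpolation between points is an assumption of this sketch rather than part of the technique as described.

```python
def assemble_cdf(points):
    """Sort (thrust, cumulative probability) pairs and check they never decrease."""
    ordered = sorted(points)
    for (_, p_low), (_, p_high) in zip(ordered, ordered[1:]):
        if p_high < p_low:
            raise ValueError("cumulative probabilities must not decrease with thrust")
    return ordered


def cumulative_probability(ordered, thrust):
    """Linearly interpolate the cumulative probability at a given thrust."""
    if thrust <= ordered[0][0]:
        return ordered[0][1]
    if thrust >= ordered[-1][0]:
        return ordered[-1][1]
    for (t0, p0), (t1, p1) in zip(ordered, ordered[1:]):
        if t0 <= thrust <= t1:
            return p0 + (p1 - p0) * (thrust - t0) / (t1 - t0)


# Hypothetical indifference points, listed in the order they were elicited.
elicited = [
    (36000, 0.5),
    (34500, 0.25), (37500, 0.75),
    (33000, 0.125), (39000, 0.875),
    (30000, 0.0), (42000, 1.0),
]
cdf = assemble_cdf(elicited)
print(cumulative_probability(cdf, 35250))
```

The monotonicity check is a simple guard: an expert whose indifference points imply a decreasing cumulative probability would need to be questioned again.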


The  Standard  Lottery 
Technique 

The  objective  of  this  technique  is  the  derivation 
of  a probability  density  function  over  all  possible  values 
of  a given  component  characteristic.  Like  the  Choice- 
Between-Gambles  technique,  this  technique  presents  the 
expert  with  two  gambling  situations.  However,  the  technique 
differs  from  the  Choice-Between-Gambles  technique  in  that 
it  does  not  involve  the  process  of  varying  probabilities  or 
performance levels until indifference is achieved.
Instead, the number of lottery tickets from a pool of 100
is varied in an attempt to achieve expert indifference.

The  technique  is  based  upon  the  following  standard 
lottery procedure. In a lottery with 100 tickets, a contestant
can purchase as many tickets as he desires; the
greater the number purchased the greater the chance of his
winning. After the purchase of tickets is completed, one
random number between 1 and 100 is drawn. The winning
contestant  is  that  individual  who  purchased  the  lottery 
ticket  with  that  number. 

In  this  technique  the  expert  is  presented  with  a 
hypothetical  lottery  of  100  tickets.  The  lottery  is  used 
as  a standard  of  comparison  in  helping  the  expert  decide 
what probability value to assign to the possible realization
of a given characteristic level of an event. The
questioning procedure is as follows:

1. Specify a possible value (e.g., thrust = 36,000
lbs.) for the relevant real-world event (e.g., experimental
jet engine thrust).

2. Have the expert imagine that he is given a
choice between a certain number of tickets in a standard
lottery with a prize value V and the right to receive the
same prize if the value of the real-world event is realized
(e.g., jet engine thrust = 36,000 lbs.).

3. For a given initial number of lottery tickets,
ask the expert which alternative gamble he feels has the
greatest chance of winning the prize: a) the holding of
the specified number of tickets of a lottery with 100
tickets outstanding, or b) the realization of the value of
the real-world event.

4. If one of the alternative gambles is preferred
over the other, next vary the number of tickets (e.g.,
increase the number if the expert chooses the real-world
event in step 3; decrease the number if he chooses the
lottery alternative) and repeat step 3.

5. Repeat steps 3 and 4 until the expert feels
that the possibility of receiving the prize for the value
of the event (engine thrust = 36,000 lbs.) has exactly the
same likelihood as, say, 70 tickets in the standard lottery.
Thus it can be inferred that the expert considers both
alternatives equally likely, and a probability of 0.7 can
be assigned to the event thrust value = 36,000 lbs.

6. Using steps 1 through 5, the expert can proceed
to similarly assign probabilities to all other possible
values of the real-world event (3:34-36).
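
Steps 3 through 5 amount to a search for the number of tickets at which the expert is indifferent. The sketch below is one way to organize that search. The source says only that the number of tickets is increased or decreased, so halving the remaining range at each question is an assumption of this sketch, as are the function names and the simulated expert who stands in for a real interview.

```python
def elicit_by_lottery(ask_expert, total_tickets=100):
    """Vary the ticket count until the expert is indifferent (steps 3 to 5).

    ask_expert(tickets) must return "event", "lottery", or "indifferent".
    """
    low, high = 0, total_tickets
    while low <= high:
        tickets = (low + high) // 2
        answer = ask_expert(tickets)
        if answer == "indifferent":
            return tickets / total_tickets
        if answer == "event":
            low = tickets + 1    # real-world event preferred: offer more tickets
        else:
            high = tickets - 1   # lottery preferred: offer fewer tickets
    # No ticket count gave exact indifference; take the midpoint of the gap.
    return (low + high) / 2 / total_tickets


def simulated_expert(tickets, belief_in_tickets=70):
    """Hypothetical expert whose belief corresponds to 70 tickets out of 100."""
    if tickets < belief_in_tickets:
        return "event"
    if tickets > belief_in_tickets:
        return "lottery"
    return "indifferent"


probability = elicit_by_lottery(simulated_expert)
print(probability)
```

With the simulated expert the search settles on 70 tickets, which corresponds to the probability of 0.7 assigned in step 5. Repeating the call for each candidate thrust value corresponds to step 6.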

The  Modified  Churchman-Ackoff 
Technique 

This technique differs from the preceding techniques
in that it does not involve betting situations, and
the expert is not asked to reveal indifference values of the
parameter in question. The expert is instead asked to
make "greater than," "equal to," or "less than" evaluations
regarding relative probabilities between two sets of values
and relative probability assessments with respect to the
most probable characteristic value. The resultant relative
probability scale is easily transformed into a probability
density function.

With this technique, the expert must decide upon
a range of possible values which the relevant event could
realize. He must specify the values for the range end
points which have zero probability of occurrence.

Next,  individual  values  within  the  range  of  possible 
values  must  be  determined.  These  values  are  determined 
through the following approach:

1.  Start  with  the  smallest  end  point  value. 

2.  Progress  upward  from  the  smallest  end  point 
value  until  the  expert  is  able  to  state  a simple  preference 
regarding  the  relative  probabilities  of  occurrence  of  the 
two values. If the expert believes that one of the two
values has a greater chance of occurrence than the other,
it can be inferred that the expert is able to
discriminate between the two values.

3. Using the higher of the two previously identified
values, repeat step 2 to determine the next value
within  the  range. 

4.  Repeat  steps  2 and  3 until  the  high  end  point 
of  the  range  of  values  is  approached. 

For  example,  using  this  procedure  for  the  thrust 
of  a jet  engine  in  development,  the  results  in  Table  1 
are obtained.

TABLE 1

Range of Values for Thrust Example


T1 = 35,000
T2 = 36,000
T3 = 37,500
T4 = 38,500
T5 = 40,000
T6 = 41,000
T7 = 41,500


The  descending  order  of  probability  of  occurrence 
for  each  value  can  be  determined  by  applying  the  following 
paired  comparison  method. 

Ask the expert to compare the first value T1 to
each of the other values, and state a preference for the
value in each pair that he believes has the greater chance
of occurring (denoting a greater probability by >, an equal
chance by =, and a lesser chance by <). The following
hypothetical preference relationships could result from
the set of seven values in Table 1: T1 < T2, T1 < T3,
T1 < T4, T1 < T5, T1 < T6, T1 = T7.

Next,  ask  the  expert  to  compare  the  second  value 
T2  to  each  of  the  other  values  succeeding  it  in  the  set. 
Continue  the  process  until  all  values  have  been  compared 
to  the  others  in  like  manner.  Table  2 lists  the  preference 
relationships which might result.

TABLE 2

Paired Comparisons


T1         T2         T3         T4         T5         T6

T1 < T2    T2 < T3    T3 < T4    T4 > T5    T5 > T6    T6 > T7
T1 < T3    T2 < T4    T3 > T5    T4 > T6    T5 > T7
T1 < T4    T2 < T5    T3 > T6    T4 > T7
T1 < T5    T2 > T6    T3 > T7
T1 < T6    T2 > T7
T1 = T7



Now total the number of times each Ti value was
preferred over the other values. The results of this
procedure are listed in Table 3. List the thrust values
in descending order of preference, and change the symbols
for each value from Ti to Xj, as shown in Table 4.

TABLE 3

Summary of Preference Relationships


T4 = 6 times
T3 = 5 times
T5 = 4 times
T2 = 3 times
T6 = 2 times
T1 = 0 times
T7 = 0 times


TABLE 4

Transformations


X1 = 38,500 (T4)
X2 = 37,500 (T3)
X3 = 40,000 (T5)
X4 = 36,000 (T2)
X5 = 41,000 (T6)
X6 = 35,000 (T1)
X7 = 41,500 (T7)
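
The tally that leads from Table 2 to Tables 3 and 4 can be sketched as follows. The comparison list transcribes Table 2, reading the last pair as T6 > T7, which is the reading consistent with the count of two preferences for T6 in Table 3. The function name and data layout are assumptions made for the sketch.

```python
VALUES = ["T1", "T2", "T3", "T4", "T5", "T6", "T7"]

# (first value, relation, second value), one entry per paired comparison.
COMPARISONS = [
    ("T1", "<", "T2"), ("T1", "<", "T3"), ("T1", "<", "T4"),
    ("T1", "<", "T5"), ("T1", "<", "T6"), ("T1", "=", "T7"),
    ("T2", "<", "T3"), ("T2", "<", "T4"), ("T2", "<", "T5"),
    ("T2", ">", "T6"), ("T2", ">", "T7"),
    ("T3", "<", "T4"), ("T3", ">", "T5"), ("T3", ">", "T6"), ("T3", ">", "T7"),
    ("T4", ">", "T5"), ("T4", ">", "T6"), ("T4", ">", "T7"),
    ("T5", ">", "T6"), ("T5", ">", "T7"),
    ("T6", ">", "T7"),
]


def tally_preferences(values, comparisons):
    """Count how many times each value was judged the more probable of a pair."""
    wins = {value: 0 for value in values}
    for first, relation, second in comparisons:
        if relation == ">":
            wins[first] += 1
        elif relation == "<":
            wins[second] += 1
        # "=" marks a tie, which adds to neither count
    return wins


wins = tally_preferences(VALUES, COMPARISONS)
# A stable sort keeps tied values (T1 and T7) in their original order.
ranking = sorted(VALUES, key=lambda value: -wins[value])
for position, value in enumerate(ranking, start=1):
    print(f"X{position} = {value} ({wins[value]} times)")
```

Under that reading the counts agree with Table 3 and the ordering agrees with Table 4.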


Arbitrarily assign a rating of 100 points to the
thrust value with the highest subjective probability, or
preference rating (e.g., X1). Then, as in the first step,
question the expert regarding the relative chance of
occurrence of each of the other values on the ordinal scale
in Table 4 with respect to the values above them on the
scale. Assigning X1 a rating of 100 points, the expert is
interrogated as to his feeling of the relative chance of
occurrence of the second highest scale value, X2, with
respect to X1. For example, if the expert decides that
X2 has 0.8 as much chance of occurring as X1, the ratings
become X1 = 100 points and X2 = 80 points.

The expert is then questioned about the relative
chance of occurrence of the next highest scale value X3,
first with respect to X1, then with respect to X2. The
resulting numerical ratings should concur. If the expert
expresses a belief that X3 has 0.5 as much chance as X1 of
occurring, and 5/8 as much chance as X2 (as a validity
check), this confirms that the relative probability of
occurrence rating for X3 is 50 points.

Continue  the  process  for  each  remaining  value  on 
the  ordinal  scale  in  Table  4.  Determine  the  relative 
number  of  points  to  be  awarded  each  value  with  respect  to 
the  top  scale  value,  and  with  respect  to  all  other  values 
on  down  the  scale  which  are  above  the  value  in  question. 

In  the  event  of  disparities  between  relative 
probability ratings for a certain value, the expert should
be asked to reevaluate his relative ratings. If this is
not successful, the average of all such ratings for a
given Xi value should be computed, and used as a relative
probability rating.

As  a result  of  the  above  process,  the  relative 
probability  ratings  shown  in  Table  5 might  be  attained. 

TABLE 5

Relative Probability Ratings

RX1 = 100 points
RX2 = 80 points
RX3 = 50 points
RX4 = 25 points
RX5 = 10 points
RX6 = 0 points
RX7 = 0 points


Finally, the scale of relative probability values
can be converted to probability density values, using the
following relationships (where the P(Xi)'s are probabilities
and the RXi's are relative probability ratings):

    P(X_i) = \frac{RX_i}{RX_1} P(X_1)    for i = 2, 3, 4, 5, 6, 7

    \sum_{i} P(X_i) = 1    for i = 1, 2, 3, 4, 5, 6, 7

These two relationships will give a system of linear
equations from which the values of each P(Xi) can be
computed. The results of the example are shown in Table 6
(3:36-43).


TABLE 6

Probability Density Function


Symbol    Thrust Value    Probability

X1        38,500          0.377
X2        37,500          0.301
X3        40,000          0.189
X4        36,000          0.095
X5        41,000          0.038
X6        35,000          0.000
X7        41,500          0.000

                          1.000
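
The conversion from relative ratings to probabilities can be sketched directly from the two relationships. Substituting the first into the second gives P(X1) = RX1 divided by the sum of all ratings, so each probability is simply its rating divided by that sum. The function name is an assumption made for the sketch; the ratings are those of Table 5.

```python
def ratings_to_probabilities(ratings):
    """Solve P(X_i) = (RX_i / RX_1) * P(X_1) together with sum(P) = 1."""
    top = ratings[0]
    if top <= 0:
        raise ValueError("the top-ranked value must have a positive rating")
    p_top = 1.0 / sum(rating / top for rating in ratings)
    return [rating / top * p_top for rating in ratings]


# Validity check from the text: X3 rated 0.5 of X1 and 5/8 of X2 must agree.
consistent = abs(0.5 * 100 - (5 / 8) * 80) < 1e-9

ratings = [100, 80, 50, 25, 10, 0, 0]   # RX1 through RX7 from Table 5
probabilities = ratings_to_probabilities(ratings)
for index, p in enumerate(probabilities, start=1):
    print(f"P(X{index}) = {p:.3f}")
```

The computed values match Table 6 to within 0.001; the small differences for X2 and X4 reflect rounding chosen so that the tabulated values sum to exactly 1.000.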

The  Delphi  Technique 

The  Delphi  technique  is  an  alternative  to  the 
committee  approach  for  eliciting  group  judgement.  It 
attempts  to  improve  this  judgement  by  removing  barriers  to 
effective  group  judgement,  such  as  dominant  individuals, 
communication  "noise,"  and  group  pressure  toward 
conformity (21:iv).

A group probability assessment may be obtained with
this  technique.  Anonymous  individual  probability  responses 
are  elicited  from  each  group  member  using  questionnaires 
(or  a similar  device).  The  group  response  is  collected  by 
analysts, who summarize the responses and feed them back
to the group. Individuals may be asked to justify assessments
which are extreme in comparison to the group. The
group members are requested to make any probability assessment
changes they desire based upon the feedback presented
to them. This procedure is iterated until group members
make no further changes in their assessments. The analysts
then summarize and define the final group response, usually
by averaging final responses, or selecting the median
response (3:24).

An example follows to clarify the Delphi Technique
(72:334-335):

First,  suppose  we  wish  to  arrive  at  how  large  the 
thrust,  T,  for  an  experimental  jet  engine  will  be.  The 
following  steps  are  involved: 

1. Ask each expert independently to give an
estimate of T, arrange responses in order, and group them
in quartiles Q1, M, and Q3, as shown:

[Figure: the ordered responses laid out along a line and
divided into quartiles, with Q1, the median M, and Q3 marked.]

2. Communicate Q1, M, and Q3 to each expert, ask
him to reconsider his previous estimate, and if his estimate
(old or revised) lies outside the interquartile range
Q1 to Q3, to state his reason.

3.  Communicate  the  results  of  this  second  round, 
plus  the  reasons,  to  the  respondents  in  summary  form,  and 
ask  for  new  estimates  and  arguments. 

4. Repeat steps 2 and 3 until no further changes
are made by the respondents, or the dispersion among
responses is acceptable. Take the median to represent the
group decision as to the value of T. The "smallness" in
dispersion depends on the criticality of T for the desired
purpose.
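
The summary that the analysts compute at each round can be sketched as below. The expert labels, the thrust estimates, and the tolerance used to decide that dispersion is acceptable are all hypothetical. The quartiles come from the Python standard library's `statistics.quantiles` with its default method, which is one of several common quartile conventions.

```python
import statistics


def summarize_round(estimates, tolerance):
    """Return quartiles, outlying experts, and whether dispersion is acceptable."""
    q1, median, q3 = statistics.quantiles(estimates.values(), n=4)
    # Experts outside the interquartile range are asked to state their reasons.
    outside = [name for name, value in estimates.items() if value < q1 or value > q3]
    return {
        "Q1": q1,
        "M": median,
        "Q3": q3,
        "asked_for_reasons": outside,
        "dispersion_acceptable": (q3 - q1) <= tolerance,
    }


# Hypothetical first-round thrust estimates (lb.) from seven experts.
first_round = {
    "A": 34000, "B": 35000, "C": 36000, "D": 36500,
    "E": 37000, "F": 38000, "G": 41000,
}
summary = summarize_round(first_round, tolerance=1500)
print(summary)
```

In this illustration experts A and G fall outside the interquartile range and would be asked for their reasons, and the interquartile range of 3,000 lb. exceeds the assumed tolerance, so another round would follow. Once the dispersion is acceptable, the median M is taken as the group value of T.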

The  DeGroot  Consensus 
Method 

DeGroot  has  proposed  a model  for  reaching  group 
consensus.  Each  group  member,  or  expert,  must  first  assess 
a probability  distribution  for  the  unknown  value  of  some 
parameter. He is then confronted with the probability
distributions of the other group members, and revises his
own opinion in light of the others by making an assessment
of each group member's relative importance, expertise, etc.
Given that each group member revises his opinion in this
manner, to be consistent each member should update his own
probability in response to revisions made by the other
group members. The process continues until further revision
no longer occurs. DeGroot showed that this process can be
interpreted within the theory of Markov chains, the limit
theorems of which can be used to see whether a consensus
distribution exists, and if so, what it is (27:118-21).
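
The iterative revision can be sketched numerically. The sketch assumes that each member revises his opinion as a weighted average of all members' current opinions, with each member's weights summing to one, so that the weights form the transition matrix of a Markov chain. For brevity each opinion is a single probability for one event rather than a full distribution. The weights, the initial opinions, and the function name are hypothetical.

```python
def degroot_consensus(weights, opinions, tol=1e-9, max_rounds=10_000):
    """Repeat weighted-average revisions until no member changes his opinion.

    weights[i][j] is the weight member i gives to member j; each row sums to 1.
    """
    current = list(opinions)
    for _ in range(max_rounds):
        revised = [sum(w * x for w, x in zip(row, current)) for row in weights]
        if max(abs(new - old) for new, old in zip(revised, current)) < tol:
            return revised
        current = revised
    raise RuntimeError("opinions did not settle within max_rounds")


# Two hypothetical experts: the first weights both equally, the second
# gives three-quarters of the weight to his own opinion.
weights = [
    [0.50, 0.50],
    [0.25, 0.75],
]
initial = [0.6, 0.9]   # each expert's initial probability for the event

final = degroot_consensus(weights, initial)
print(final)
```

Both opinions settle at 0.8, which equals the initial opinions weighted by the stationary distribution of the chain (1/3 and 2/3). A settled result is a consensus only if all members end at the same value; if every member gives all weight to himself, revision stops at once with no agreement.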

The Direct Estimation
Technique 

This  technique  results  directly  in  a probability 
density  function  as  the  individual  expert  approximates  the 
probabilities  without  benefit  of  one  of  the  techniques  for 
inferring  probabilities  that  were  described  earlier. 


APPLICATION  OF  SUBJECTIVE  PROBABILITY 
ASSESSMENT  TO  WEAPON  SYSTEM 
SOURCE  SELECTION 

Little  information  has  been  found  concerning  the 
extent  to  which  subjective  probability  assessment  techniques 
have  been  utilized  in  the  weapon  system  acquisition  process 
due  to  the  confidentiality  involving  source-selection 
documents.  DOD  Directive  4105.62,  "Selection  of  Contractual 
Sources  for  Major  Defense  Systems,"  establishes  DOD 
objectives,  principles  and  policy  for  the  evaluation  of 
proposals  in  the  selection  of  contractual  sources  for  each 
system/project. In accordance with the Directive, Government
solicitations for contractor proposals must require
the competitors to identify technical risks and uncertainties
and suggest realistic approaches to their resolution
(95:11). The Directive does not go into detail on any
recommended  subjective  probability  assessment  procedure  for 


Air Force Regulation (AFR) 70-15, "Source Selection
Policy," 22 June 1973, implements DOD Directive 4105.62
within the Air Force, and is the basic Air Force Directive
pertaining to source-selection actions (97). Air Force
Manual 70-6, "Source Selection Procedures," 22 June 1973,
defines the guidelines contained in AFR 70-15 and prescribes
general  procedures  to  be  employed  in  major  source  selection 
actions  (96).  Both  the  Regulation  and  the  Manual  call  for 
the  source-selection  process  to  focus  attention  on  technical 
risk and uncertainties for major developmental programs.
The Air Force solicitation to potential contractors should
identify  potential  areas  of  high  risk  if  there  is  a reason 
to  believe  that  the  risks  are  not  generally  known  to  the 
offerors.  The  offerors  should  identify  technical  risks 
associated  with  their  proposals  and  the  possible  impacts 
on  cost,  schedule,  or  performance,  together  with  realistic 
approaches to their resolution. Risk analysis is identified
as part of the source-evaluation process, and risk assessments
for each proposal must be included in all reports
to the Source Selection Advisory Council. Technical risk
should be an evaluation criteria element, and should be
rated based upon the offeror's risk assessment and the
credibility of the offeror's proposed approach for elimination
or avoidance of the risk (96:3-5 to 3-6; 97:4-5).
Neither the Regulation nor the Manual goes into any detail on
any recommended subjective probability assessment procedure.

Air Force Systems Command Regulation 70-9, "Source
Selection  Procedures,"  16  August  1974,  addresses  itself  to 
procedures  for  source-selection  actions  below  the  dollar 
thresholds  established  by  AFR  70-15  (94).  This  Regulation 
leaves  to  the  discretion  of  the  Source  Selection  Authority 
whether  the  evaluation  and  rating  system  of  the  offeror's 
proposals  should  be: 

1. Substantially and predominantly mathematical
scoring;

2. Partial scoring and partial subjective assessment;
or

3. Wholly subjective analysis and assessments.

The Regulation relates that there has been a swing
from numerical scoring to rating proposal elements using
color-coded  narrative  assessments  for  briefings.  The 
following,  as  described  by  the  Regulation,  is  a subjective 
assessment  color-coding  system  frequently  used: 

1. Green: exceeds specified performance or capability
and excess is useful; high probability of
success; no significant weakness;

2. Blue: average; meets most objectives; good
probability of success; deficiencies can be
corrected;

3. Yellow: weak; low probability of success;
significant deficiencies; but correctable;

4. Red: key element fails to meet intent of the
Request for Proposal (RFP) [94:45].

The  ASD  Handbook  on  the  source-selection  process 
(98)  states  that  a risk  analysis  will  be  performed  during 
source  selection,  but  it  also  fails  to  go  into  detail  on 
any  recommended  subjective  probability  assessment 


Chapter  3 

RESEARCH  METHODOLOGY 
OVERVIEW 

The research team evaluated the subjective probability
assessment techniques using content analysis. The
population considered in the study consisted of the subjective
probability techniques defined in Chapter 2. The
population  and  sampling  plan,  content  analysis,  coding  plan 
and  categories,  pilot  study,  reliability  of  the  code, 
coding  and  summarization  of  the  units  of  content,  summary 
of  assumptions,  and  summary  of  limitations  are  described 
below. 

POPULATION  DESCRIPTION  AND  SAMPLING  PLAN 

The  population  of  subjective  probability  techniques 
was  sampled  by  reviewing  pertinent  subjective  probability 
literature as identified in Table 7. The sample time
period of convenience covered was from January, 1960, to
the present. The sampling plan was a nonrepresentative
sample of convenience. The researchers attempted to reduce
the bias resulting from the sampling plan by sampling,
within the time constraints placed on the research team, as
broad a range of literature as possible in terms of time,
application, and theory. A listing of the sources used in
the  content  analysis,  by  technique,  can  be  found  in 
Appendix  B. 

TABLE 7

Sampling Sources

Source Index                     Identifiers

Books — in Libraries of:         Decision Theory (Library of
  AFIT Engineering School          Congress Codes HD 38, HD 69)
  AFIT School of Systems         Probability Theory (QA 273-279)
    and Logistics                Psychological Theory (BF 38-39,
  Wright State University          BF 441)

Business Periodicals Index       Probabilities

Defense Documentation            Cost Uncertainty Analysis
  Center/Defense Logistics       Decision Making
  Studies Information Center     Decision Theory
                                 Delphi
                                 Prediction Methods/Forecasting
                                 Probability Theory
                                 Risk Analysis

Mathematical Reviews             Probability

Psychological Abstracts          Cognitive Processes and
                                   Motivation
                                 Decision and Choice Behavior
                                 Decision and Information Theory
                                 Learning and Memory
                                 Learning, Thinking, and
                                   Conditioning

Statistical Theory and           Probability
  Methods Abstracts

CONTENT ANALYSIS

The  methodology  which  was  selected  to  answer  the 
research  question  and  meet  the  research  objective  was 
content  analysis.  Content  analysis  has  been  variously 
described  as  a "research  technique  for  the  objective, 
systematic,  and  quantitative  description  of  the  manifest 
content  of  communication  [8:18],"  and  as  "a  procedure  of 
classification, summarization, and tabulation [34:646]."

The  most  basic  distinction  made  in  content  analysis 
is  between  content  analysis  done  at  the  manifest  level,  and 
content  analysis  done  at  the  latent  level.  Content  analysis 
at  the  manifest  level  is  analysis  of  what  the  material 
being  analyzed  literally  stated.  In  contrast,  content 
analysis  at  the  latent  level  goes  beyond  what  the  material 
being  analyzed  said  literally  in  order  to  make  inferences 
about  what  the  material  implied  or  meant.  Evidence  has 
indicated  that  content  analysis  at  the  manifest  level  is 
reliable  and  valid;  this  was  not  true  of  content  analysis 
at  the  latent  level  (34:647-48).  The  research  effort  used 
content  analysis  at  the  manifest  level. 

There were three basic steps in applying the content
analysis technique: 1) deciding what the unit of content,
or material to be categorized, would be; 2) developing the
set of categories; and 3) developing a coding rationale to
guide the placement of the unit of content into categories
(34:649).

The first step was to select what is called the unit
of content, or material to be categorized. Generally, this
involved making a choice between using each sample in total,
or breaking each sample down into the separate words or
phrases which make it up (34:649,651). If it were decided
to  use  each  sample  in  total  as  the  unit  of  content,  each 
sample  would  be  read  completely  and  categorized  on  the 
basis  of  everything  it  contained.  If  the  separate  words  or 
phrases  in  each  sample  were  used  as  the  unit  of  content, 
each  separate  word  or  phrase  that  indicated  some  specific 
perception  of  what  was  being  analyzed  would  be  categorized 
separately.  The  research  used  the  phrase,  here  described 
as a word or group of words that form a unit expressing a
perception,  as  the  unit  of  content.  Each  phrase  from  the 
sample  of  previously  identified  subjective  probability 
techniques  that  indicated  a specific  perception  dealing 
with  one  of  the  criteria  categories  specified  below  was 
coded. 

To do the content analysis, a set of categories and
a method of coding were needed. In addition to reliability
and validity, the desirable attributes of a set of
categories were homogeneity, inclusiveness, usefulness, and
mutual exclusiveness. Homogeneity is the property that
each level of categories is similar in content and level of
abstraction. Inclusiveness is the requirement that every
unit of content be classified. Usefulness is defined as
reflecting the fact that each category delineates a
meaningful dimension of the variable under study. Mutual
exclusiveness is the attribute that there be one place and
only one place to code any one unit of content (34:675-77).
Minimally, the code should have face validity (34:672).

There were three aspects to each unit of content to
be categorized by coding: the subjective probability
technique, the criteria category, and the negative-positive
valence. Valence expressed feeling tone, or how strong an
element of personal statement was contained in the unit of
content (34:659). Since there were three different aspects
to code, the code consisted of three digits: the first
digit dealt with the specific probability technique, the
second digit with the criteria category, and the third
digit with the negative-positive valence. The coding plan
and categories established for this research are shown in
Table 8.

Coding  Plan  and 
Categories 

The first digit in the coding assigned the unit of
content to one of the subjective probability techniques
being evaluated. For example, if the unit of content being
coded contained a perception dealing with the
Choice-Between-Gambles technique, its coding began with the digit
"1." This first level of categorization had all the
previously defined desirable attributes of a set of categories,
in that the subjective probability techniques being
evaluated, with their corresponding first-digit code, were as
follows: (1) Choice-Between-Gambles, (2) Standard Lottery,
(3) Modified Churchman-Ackoff, (4) Delphi, (5) DeGroot
Consensus Method, and (6) Direct Estimation.

TABLE 8

Code for Content Analysis of Subjective
Probability Techniques

First Digit: Subjective Probability Technique

  1 — Choice-Between-Gambles
  2 — Standard Lottery
  3 — Modified Churchman-Ackoff
  4 — Delphi
  5 — DeGroot Consensus Method
  6 — Direct Estimation

Second Digit: Criteria Category

  1 — Ease of Application
  2 — Adaptability and Flexibility
  3 — Validity and Reliability
  4 — Time
  5 — Removal of Bias
  6 — Miscellaneous

Third Digit: Valence Category

  1 — Negative
  3 — Mixed
  5 — Positive

The second digit in the coding was used to identify
the criteria category. The categories served as a means of
summarizing the relative attributes of the subjective
probability techniques. For example, if the unit of content
being coded contained a perception pertaining to the ease
of application of the Choice-Between-Gambles technique, its
coding started with the digits "11." The criteria
categories also contained the previously defined desirable
attributes of a set of categories: homogeneity, in that
they were related to one another as criteria to evaluate
the subjective probability technique; and usefulness,
reflecting the fact that each criteria category served a
purpose and delineated a criterion of the subjective
probability techniques under study.

In  order  to  make  the  criteria  categories  inclusive, 
a miscellaneous  category  was  added  as  a way  of  including 
those  units  of  content  which  did  not  fit  a predetermined 
criteria  category.  After  initial  coding,  the  miscellaneous 
criteria  category  was  surveyed  for  similar  units  of  content. 
If  these  similar  units  of  content  constituted  five  percent 
of  the  total  units  coded,  a new  criteria  category  was  added, 
and  the  similar  units  of  content  were  placed  in  that  new 
criteria  category. 
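
The five percent rule for the miscellaneous category can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the theme labels, the function name `themes_to_promote`, and the reading of "constituted five percent" as "at least five percent" are hypothetical and are not taken from the research itself.

```python
from collections import Counter


def themes_to_promote(misc_themes, total_units, threshold=0.05):
    """Return miscellaneous themes large enough to become new categories.

    misc_themes: one (hypothetical) theme label per unit of content that
        was coded to the miscellaneous criteria category.
    total_units: total number of units of content coded in all categories.
    threshold: assumed cutoff, read here as "at least five percent".
    """
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    counts = Counter(misc_themes)
    # A theme is promoted when its share of ALL coded units meets the cutoff.
    return sorted(
        theme for theme, n in counts.items() if n / total_units >= threshold
    )


if __name__ == "__main__":
    # Hypothetical example: 100 units coded, 8 of them miscellaneous.
    misc = ["cost"] * 6 + ["morale"] * 2
    print(themes_to_promote(misc, total_units=100))
```

Note that the share is taken against the total units coded, not against the size of the miscellaneous category, which follows the wording of the rule above.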
Each subjective probability technique was assessed
with respect to the following criteria categories (see
Appendix C for adjectives and phrases used to code the units
of content to each criteria category):

1. Ease of application — this criteria category
included: the level of expertise required of the analysts
in preparing and administering the technique and in
analyzing the results obtained from the technique; the
requirements for defining and selecting respondents
(experts); the training needed for analysts or respondents
to use the technique; the money cost of the technique; and
any unique equipment or facilities needed to apply the
technique.

2. Adaptability and flexibility — this criteria
category addressed the utility of the technique to the
user in terms of: its application to more than one use;
the limitations of the technique; the changeability of the
procedures of the technique; the potential usefulness of
the technique; and the ability of the technique to handle
changes in the problem under consideration.

3. Reliability and validity — reliability here
means that similar groups/individuals would make the same
assessment using the technique. Validity pertains to the
accuracy of the technique, or how close the results obtained
from the technique are to what occurs in the real world.
This area included: the replicability of the technique;
the validity of the methodology and data obtained; the
precision and objectivity of the technique; the relative
effectiveness of the technique compared to other
techniques; the validity of the technique in dealing with real
problems; and evidence demonstrating the reliability and
validity of the technique.

4.  Time — this  criteria  category  addressed  the  time 
factor  involved  in  using  the  technique,  and  included  the 
length  of  time  required  to  apply  the  technique;  the  time 
required  to  process  the  results;  and  the  time  needed  to  get 
the  information  obtained  from  the  technique  to  the 
appropriate  decision  makers. 

5.  Removal  of  bias — bias  in  this  category 
referred  to  the  bias  introduced  in  the  results  because  of 
individual/group interactions among respondents; analyst or
administrator  bias;  user  bias;  procedural  bias;  and  bias 
due  to  information  or  the  lack  of  information  obtained  using 
the  technique. 

6.  Miscellaneous — this  criteria  category  included 
any  units  of  content  not  clearly  belonging  in  the  other 
criteria  categories. 

The third digit in the coding recorded the
negative-positive valence with which a unit of content related a
criteria category to a subjective probability technique.
If the unit of content, for example, contained a negative
perception dealing with the ease of application of the
Choice-Between-Gambles technique, the coding was "111."
The valence categories also satisfied the desired
attributes of a set of categories. To standardize the
application of the valence categories, they (along with
their coding) were defined as follows:

(1) Negative — a unit of content containing only
perceptions of unacceptance or disapproval.

(3)  Mixed — a unit  of  content  containing  both 
negative  and  positive  elements. 

(5)  Positive — a unit  of  content  containing  only 
perceptions  of  acceptance  or  approval. 

A list  of  adjectives  and  phrases  which  express 
negative,  mixed,  or  positive  valences  can  be  found  in 
Appendix  D. 
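
The three-digit code described above can be illustrated with a short Python sketch. The digit meanings are taken from Table 8 and from the valence definitions in the text (1, 3, and 5); the function names `decode` and `encode` are hypothetical and serve only to show how a code such as "111" is read.

```python
# Digit meanings as given in the coding plan; names below are illustrative.
TECHNIQUES = {
    1: "Choice-Between-Gambles",
    2: "Standard Lottery",
    3: "Modified Churchman-Ackoff",
    4: "Delphi",
    5: "DeGroot Consensus Method",
    6: "Direct Estimation",
}
CRITERIA = {
    1: "Ease of Application",
    2: "Adaptability and Flexibility",
    3: "Validity and Reliability",
    4: "Time",
    5: "Removal of Bias",
    6: "Miscellaneous",
}
VALENCES = {1: "Negative", 3: "Mixed", 5: "Positive"}


def decode(code):
    """Split a three-digit code into (technique, criteria category, valence)."""
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not a three-digit code: {code!r}")
    technique, criterion, valence = (int(ch) for ch in code)
    if (technique not in TECHNIQUES or criterion not in CRITERIA
            or valence not in VALENCES):
        raise ValueError(f"digit out of range in code: {code!r}")
    return TECHNIQUES[technique], CRITERIA[criterion], VALENCES[valence]


def encode(technique, criterion, valence):
    """Build the three-digit code from its three numeric digits."""
    code = f"{technique}{criterion}{valence}"
    decode(code)  # reuse the validation above
    return code


if __name__ == "__main__":
    print(decode("111"))
    print(encode(4, 6, 5))
```

Because each digit has exactly one meaning, a unit of content has one and only one place in the code, which is the mutual exclusiveness property discussed earlier.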


Pilot Study

Since satisfactory reliability cannot always be
achieved, a pilot study is critical in any research in
which content analysis will be used (34:648). The
researchers each performed a pilot study of Elsbernd's
unpublished research study, "The Use of the DELPHI Method
Within the Defense Department" (31). This pilot study was
performed in order to provide an estimate of the success
the researchers would have if they used their version of
content analysis to answer the research question, and in
order to validate the reliability of the code. The results
of the pilot study are discussed below under "Reliability
of the Code."

Reliability of the Coders

In  order  to  estimate  the  reliability  of  any  single 
coder,  it  is  necessary  to  have  developed  a standard  set  of 
approximately  100  coded  units  of  content  to  which  a number 
of  individuals  have  agreed.  The  coder  whose  reliability 
is  to  be  estimated  then  codes  this  standard  set  of  units 
of  content  (34:670).  Due  to  the  lack  of  a standardized 
coding  test  for  this  thesis  subject,  it  was  assumed  for  the 
purpose  of  this  research  that  both  researchers  were 
reliable  coders. 


Reliability  of  the  Code 

Reliability  of  the  content  analysis  code  was 
estimated  by  computing  the  percent  of  time  that  the  two 
researchers  agreed  when  they  each  coded  the  same  sample  of 
27  units  of  content  during  the  pilot  study.  The  percent 
of  agreement  was  computed  as  follows: 


                    Number of units coded identically
Percent agreement = --------------------------------- x 100
                       Total number of units coded


With the three-digit code being used in this research,
85 percent agreement was a realistic expectation
(34:669-670). After initially coding the same material, the
researchers' percent agreement was 67 percent. Accordingly,
the research team met to explain their rationale for their
own coding of the data. Once the nature of the discrepancy
was determined, the categories were reworded to eliminate
the possibility of alternative interpretations. After a
second coding of the same material by each researcher, the
percent agreement was 93 percent. The code thus met the
85 percent agreement criterion. The initial and recoded
content analysis results are shown in Table 9.


TABLE 9

Summary of Pilot Study

Unit of            Initial Codes         Final Codes
Content  Page    Coder 1  Coder 2     Coder 1  Coder 2

   1     25        455      455         455      455
   2     27        455    * 453         455    * 453
   3     37        413    * 411         411      411
   4     38        411      411         411      411
   5     38        441      441         441      441
   6     39        411    * 461         411      411
   7     40        461      461         461      461
   8     40        415    * 463         415      415
   9     41        465      465         465      465
  10     41        465      465         465      465
  11     41        411      411         411      411
  12     41        441      441         441      441
  13     41        441      441         441      441
  14     41-42     441      441         441      441
  15     42        441      441         441      441
  16     42        411    * 461         411      411
  17     42        411    * 461         411      411
  18     42        411    * 461         411      411
  19     42        461      461         461      461
  20     42        411    * 461         411    * 461
  21     43        461      461         461      461
  22     42        431    * 461         461      461
  23     62        441      441         441      441
  24     26        411      411         411      411
  25     41        415      415         415      415
  26     41        415      415         415      415
  27     42        461      461         461      461

* Indicates disagreement

Initial percent agreement = 18/27 = 67 percent
Final percent agreement = 25/27 = 93 percent
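
The percent agreement figures can be checked with a brief Python sketch. The code lists are transcribed from Table 9; the function name `percent_agreement` and the variable names are illustrative choices, not part of the original research.

```python
def percent_agreement(codes_a, codes_b):
    """Percent of units to which two coders assigned identical codes."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("both coders must code the same non-empty set of units")
    identical = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * identical / len(codes_a)


# Codes for the 27 pilot-study units, in unit order, from Table 9.
INITIAL_CODER_1 = [455, 455, 413, 411, 441, 411, 461, 415, 465,
                   465, 411, 441, 441, 441, 441, 411, 411, 411,
                   461, 411, 461, 431, 441, 411, 415, 415, 461]
INITIAL_CODER_2 = [455, 453, 411, 411, 441, 461, 461, 463, 465,
                   465, 411, 441, 441, 441, 441, 461, 461, 461,
                   461, 461, 461, 461, 441, 411, 415, 415, 461]
FINAL_CODER_1 = [455, 455, 411, 411, 441, 411, 461, 415, 465,
                 465, 411, 441, 441, 441, 441, 411, 411, 411,
                 461, 411, 461, 461, 441, 411, 415, 415, 461]
FINAL_CODER_2 = [455, 453, 411, 411, 441, 411, 461, 415, 465,
                 465, 411, 441, 441, 441, 441, 411, 411, 411,
                 461, 461, 461, 461, 441, 411, 415, 415, 461]

if __name__ == "__main__":
    initial = percent_agreement(INITIAL_CODER_1, INITIAL_CODER_2)
    final = percent_agreement(FINAL_CODER_1, FINAL_CODER_2)
    print(f"Initial: {initial:.0f} percent; final: {final:.0f} percent")
    print("Meets 85 percent criterion:", final >= 85)
```

Agreement here requires all three digits to match, so a difference in valence alone (as in unit 2) counts as a disagreement.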

Coding and Summarization
of the Units of Content

In the actual process of the content analysis,
units of content from the sample were coded by the research
team. Once the units of content were fully coded, they
were summarized by the researchers. The two levels of
summarization which were obtained using the content
analysis are displayed in Figure 1.

The first level of summarization obtained by the
researchers was the relative frequency of occurrence of
each three-digit code with the same first two digits.
The frequencies of codes 111, 113, and 115, for example,
were computed relative to each other. Based upon the
three-digit code with the highest relative frequency, each
two-digit code was assigned a one, three, or five valence. In
cases of ties the average of the tying valences was assigned.
If code 111 had the highest frequency relative to 113 and 115,
for example, then a one (negative valence) was assigned to
the criteria category 11 (ease of application criteria
relating to the Choice-Between-Gambles technique). Each
other criteria category relating to the Choice-Between-Gambles
technique was likewise assigned a one, three, or five valence.
Each number thus assigned to a two-digit code was summed to
its applicable first digit code (subjective probability
technique); this was the second level of summarization.
The sum of the valences assigned to the two-digit codes was
the criteria evaluation of the subjective probability
technique. The subjective probability technique with the
largest sum value was selected as that technique which best
answered the research question and met the research
objective.
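
The two levels of summarization can be sketched in Python as follows. This is a minimal illustration, not the researchers' own procedure: the function names and the example tallies are hypothetical, and the treatment of a criteria category with no coded units (a value of zero) follows the way such categories are scored in the findings of Chapter 4.

```python
from collections import Counter

VALENCES = (1, 3, 5)  # negative, mixed, positive


def category_valence(unit_valences):
    """First level: valence assigned to one two-digit code.

    The most frequent valence wins; tied valences are averaged; a
    category with no coded units is given zero.
    """
    if not unit_valences:
        return 0
    counts = Counter(unit_valences)
    top = max(counts.values())
    tied = [v for v in VALENCES if counts.get(v, 0) == top]
    return sum(tied) / len(tied)


def technique_score(units_by_category):
    """Second level: sum of the category valences for one technique."""
    return sum(category_valence(v) for v in units_by_category.values())


def best_technique(scores):
    """Technique with the largest summed valence."""
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical tallies for one technique, keyed by criteria category.
    example = {
        "ease": [1] * 13 + [3] * 2 + [5] * 2,   # mostly negative -> 1
        "adaptability": [1, 3, 3, 5],           # mixed is modal   -> 3
        "reliability": [1] * 7 + [3] * 7 + [5] * 9,  # positive    -> 5
        "time": [1, 1, 5, 5],                   # tie of 1 and 5   -> 3
        "bias": [1, 1, 1],                      # all negative     -> 1
        "miscellaneous": [1, 3, 5],             # three-way tie    -> 3
    }
    print(technique_score(example))
```

Summing the category valences without weights reflects the assumption, stated in the summary of assumptions, that the criteria are equally weighted.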


SUMMARY  OF  ASSUMPTIONS 


1.  Assessment  of  uncertainty  regarding  cost 
estimates  is  subjective. 

2.  Subjective  estimates  of  uncertainty  can  be 
expressed  in  terms  of  probabilities. 

3. The criteria for evaluating the subjective
probability techniques were mutually exclusive,
collectively exhaustive, and equally weighted.

4. The number of sources reviewed was sufficient
to evaluate the subjective probability techniques.

5. The research team's version of content analysis
is valid and reliable.

6. The researchers are reliable content analysis
coders.

7. The Martin Cost Model is valid for future cost
estimation.


SUMMARY  OF  LIMITATIONS 


Conclusions  of  this  study  are  limited  to  selecting 
the  subjective  probability  technique  which  best  assesses 
the  magnitude  of  uncertainty  in  a weapon  system  acquisition. 


Chapter  4 


FINDINGS AND CONCLUSIONS

OVERVIEW 

This chapter discusses the results of the content
analysis by assessment technique, and sets forth the
conclusions that the research team was able to draw from the
content analysis.

RESULTS  OF  CONTENT  ANALYSIS 

The  Choice-Between-Gambles 
Technique 

The  evaluation  of  this  technique  resulted  in  a 
score  of  16,  based  on  57  units  of  content  from  nine  sources 
(see  Appendix  B for  the  source  listing  for  each  technique). 

A narrative  summary  of  each  criteria  category  follows. 

Ease of application. This criteria category received a one
(negative) valence, based on a 76.4 percent relative
frequency of occurrence of that valence. Representative
comments by valence were:

1. Negative (76.4%) — The Choice-Between-Gambles
technique is difficult to administer (55:41). The technique
runs into difficulty when an individual is unwilling to risk
any part of his capital, is unwilling to gamble with his own
money regardless of odds, or is unwilling to gamble on moral
grounds (36:103,127). The expert must understand the
concepts of probability. He may find difficulty in the
determination of the highest or lowest value for which he can
state a subjective probability, due to his limited ability
to discriminate between values (3:64). The technique also
requires extensive expert training, the application of a
tedious methodology, and highly skilled and experienced
analysts. Still another difficulty is that decision makers
may rebel at the idea of playing a "game" to assess their
subjective probabilities, and may not seriously participate
or concentrate on their assessment task (55:41,47,66).

2. Mixed (11.8%) — The technique assumes that the
expert has normative and substantive expertise (3:34).

3.  Positive  (11.8%) — The  technique  is  simple  to 
apply  and  results  directly  in  a probability  density  function 
(3:27). 

Adaptability and flexibility. This criteria category
received a three (mixed) valence, since two of the four
units of content were mixed. Valence comments were:

1. Negative (25%) — The method suffers from being
insufficiently general (73:73).

2. Mixed (50%) — The method produces only discrete
probability functions (3:27). Providing that the stakes
are not too large and assuming that utility problems do not
arise, the betting situation can be used for assessment of
subjective probabilities (102:23).

3. Positive (25%) — A rough check of someone else's
probability against the standard betting odds is useful
(11:336).

Reliability and validity. This category received a five
(positive) score, based on a 39.2 percent relative frequency
of occurrence. Representative comments were:

1. Negative (30.4%) — A primary limitation of the
technique is the expert's ability to respond when through
further subdivision the probability of occurrence of the
interval becomes small (303). Another limitation is that
the method is necessarily inexact, partly because of the
diminishing marginal utility of money, and partly because
a person may have special eagerness or reluctance to bet.
The proposal of a bet may alter an individual's state of
opinion (73:73), or may fail because the human mind has
proven itself inept at estimating odds (36:103). Also, if
the amount of money being bet is very small, the individual
may be careless in judging the odds (36:103).

2. Mixed (30.4%) — The merits of the method are
related to the expert's ability to respond easily and
intelligently, and to the confidence that can be placed in
the expert's response (87:45). The technique assumes that
the monetary rewards are large enough to motivate the expert;
that the expert has normative and substantive expertise; and
that the expert is able to make more rational, consistent,
and correct judgements when presented with gambles than if
he were asked to directly estimate probabilities (3:28,34).

3. Positive (39.2%) — The Choice-Between-Gambles
technique is fundamentally sound (73:73). The fact that
the technique is generally axiomatically valid has been its
main advantage. Many authors feel that this form of
questioning approach results in a more realistic subjective
density function than a direct questioning approach (3:24).
The technique may prove useful in vague situations, since
the consideration of betting situations helps to alleviate
vagueness (102:97). Compared to other techniques, the
technique results in a more valid density function (3:27); has
a higher response confidence due to the ease of response
(87:45); and is more realistic since it incorporates risk
in its analysis (55:41).

Time. In this criteria category, since two units of content
were negative and two were positive, the valences were
averaged to give a score of three (mixed).

Removal of bias. A score of one (negative) was given to
this criteria category, since all of the units of content
were negative. Valence comments included:

1. Negative — The main problem with the technique is
that different individuals may interpret or react
differently to the same probability; thus, bias is introduced
into the method according to how the decision maker
perceives the probability (55:46). The "probabilities" of
the gamble can be presented in several ways, and the way
used will affect the behavior to be expected from the
appraiser (74:85). The individual's attitudes toward risk
may also influence his decision (55:44,47; 74:23,85).

Miscellaneous. A score of three (mixed) was recorded in
this criteria category. There was one unit of content in
each valence; thus the valences were averaged to obtain the
score of three.

The  Standard  Lottery 
Technique 

The  evaluation  of  the  Standard  Lottery  technique 
resulted  in  a score  of  24,  based  upon  33  units  of  content 
from  nine  sources.  A narrative  summarization  of  the  content 
analysis  by  criteria  category  follows. 

Ease  of  application.  A five  (positive)  valence  was  assigned 
to  this  criteria  category,  based  upon  a 65  percent  relative 
frequency  of  occurrence  in  the  14  units  of  content  coded  in 
this  category.  Comments,  by  valence,  were: 

1. Negative (21%) — The success of the Standard
Lottery technique is dependent upon the expert's familiarity
with lottery-type betting situations (3:26). A problem in
the technique's application is that the expert is unable to
discriminate between values when the probability of their
occurrence is small (54:97; 58:146).

2. Mixed (14%) — The expert may find it difficult
to determine the highest or lowest characteristic value for
which he can state a subjective probability (3:27), and
to determine a single probability that makes him indifferent
between two lotteries (102:21).

3. Positive (65%) — The Standard Lottery technique
is a simplifying device (58:145; 81:477) which is easy to
apply (3:27,64). Since probability statements are not made
directly, the expert with little probability theory may be
more comfortable with this technique (3:36,64; 58:5). The
technique also lends itself to solicitation of responses
without having to use the personal interview technique
(58:145-146).

Adaptability and flexibility. A five (positive) valence was
assigned to this category, based upon a relative frequency
of occurrence of 75 percent in the four units of content:

1. Negative — None coded.

2. Mixed (25%) — In situations where vagueness is
present, the standard lottery technique may prove useful
(102:97).

3. Positive (75%) — The standard lottery technique
is applicable to virtually any situation involving
uncertainties (54:97). Businessmen can find a unique set of
weights which describe their attitudes in a more complex
situation by using the standard lottery technique (78:14;
79:12). The technique can also be employed to check the
consistency of expert responses after another technique was
used originally to define the probabilities for the variables
of interest.

Reliability and validity. A relative frequency of
occurrence of 57 percent resulted in a five (positive) valence
for this criteria category, with seven units of content
coded. Comments were:

1. Negative (29%) — There is no response consistency
check inherent in the technique (3:64; 58:146).

2. Mixed (14%) — The lottery technique should help
to reduce vagueness, but it is not possible generally to
eliminate it entirely (102:21).

3. Positive (57%) — The internal consistency of
lottery techniques can be improved through the use of the
lottery procedure (2:16). The lottery results in a more
valid density function than direct estimation, providing
an improved process for eliciting subjective responses over
direct estimation (3:27,36). Considering lotteries also
helps to combat vagueness (102:97).

Time. A positive valence of five was given this criteria
category, based upon a relative frequency of occurrence of
100 percent in four units of content. Category comments
included:

1. Negative — None coded.

2. Mixed — None coded.

3. Positive (100%) — The technique is not time
consuming (3:47,64; 58:146), and can be performed without using
a time consuming interview (58:145).

Removal of bias. A zero was assigned to this category, since
no units of content were coded to this category.

Miscellaneous. A five (positive) valence for this category
resulted from a 67 percent relative frequency of occurrence
in three units of content. The units of content included:

1. Negative (33%) — Comparing rewards to lotteries
reduces decision making to the level of the casino (81:477).

2. Mixed — None coded.

3. Positive (67%) — The technique derives probability
density functions through inference rather than direct
questioning (3:27); however, only discrete probability functions
are derived (58:135).

The Modified Churchman-Ackoff
Technique

The  evaluation  of  this  technique  resulted  in  a 
score  of  seven,  based  on  14  units  of  content  from  two  sources. 
A narrative  summarization  of  the  content  analysis  by  criteria 
category  follows. 

Ease  of  application.  A one  (negative)  valence  was  assigned 
to  this  criteria  category,  based  upon  a 60  percent  relative 
frequency  of  occurence  in  the  five  units  of  content  coded 
in  this  category.  Units  of  content,  by  valence,  were: 

1.  Negative  (60%) — The  technique  is  not  as  easy 
to  apply  as  other  techniques.  To  apply  it,  one  must  under- 
stand the  concepts  of  probability  (3*64),  and  one  must  use 


another  technique  to  establish  the  endpoints  of  the  distri- 
bution (58:15*0- 


2.  Mixed — None  coded. 

3. Positive (40%) -- The technique does not require
a knowledge of probability theory. Some experts find the
technique easier to use for some classes of application than
other techniques (58:153).

Adaptability  and  flexibility.  A zero  was  assigned  to  this 
category, since no units of content were coded in this
category. 


Reliability  and  validity.  A relative  frequency  of  occurrence 
of  71  percent  resulted  in  a five  (positive)  valence  for  this 
area  category,  with  seven  units  of  content  coded.  Comments 
were:

1. Negative (29%) -- The technique involves an untested
approach (3:43; 58:153).

2.  Mixed — None  coded. 

3. Positive (71%) -- The technique offers a systematic
method of checking the consistency of relative value
judgements made by experts, enhancing the validity of the
resulting probability distribution (3:43,64; 58:153).

Time.  A negative  valence  of  one  was  given  to  this  category, 
based  on  a relative  frequency  of  occurrence  of  100  percent 
in  two  units  of  content.  Category  comments  included: 

1. Negative (100%) -- The technique is more time consuming
than other techniques (3:64; 58:153-154).

2. Mixed -- None coded.

3. Positive -- None coded.

Removal  of  bias.  A zero  was  assigned  to  this  category, 
since  no  units  of  content  were  coded  in  this  area. 

Miscellaneous. A zero was assigned to this criteria
category,  as  it  also  had  no  units  of  content. 
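
The scoring arithmetic behind these category summaries can be illustrated with a short Python sketch. It assumes, from the figures reported in this chapter, that a category takes the valence (one, three, or five) with the highest relative frequency, that a category with no units of content scores zero, and that a technique's score is the sum of its category valences. The unit counts per valence are inferred from the reported percentages, and the function and variable names are hypothetical.

```python
# Valence codes used in the narrative: 1 = negative, 3 = mixed, 5 = positive.
NEGATIVE, MIXED, POSITIVE = 1, 3, 5


def category_valence(counts):
    """Return the valence with the most units of content, or 0 if none were coded.

    counts maps a valence code to its number of units of content.
    Ties are not handled in this sketch.
    """
    total = sum(counts.values())
    if total == 0:
        return 0
    return max(counts, key=counts.get)


# Modified Churchman-Ackoff technique; counts are inferred from the
# percentages (e.g. 60 percent of five units = three negative units).
churchman_ackoff = {
    "ease of application": {NEGATIVE: 3, MIXED: 0, POSITIVE: 2},
    "adaptability and flexibility": {},
    "reliability and validity": {NEGATIVE: 2, MIXED: 0, POSITIVE: 5},
    "time": {NEGATIVE: 2, MIXED: 0, POSITIVE: 0},
    "removal of bias": {},
    "miscellaneous": {},
}

valences = {name: category_valence(c) for name, c in churchman_ackoff.items()}
score = sum(valences.values())
units = sum(sum(c.values()) for c in churchman_ackoff.values())
print(valences)
print(score, units)
```

Under these assumptions the sketch gives a score of seven from 14 units of content, in agreement with the figures reported for this technique.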

The  Delphi  Technique 

The  evaluation  of  the  Delphi  technique  resulted  in 
a score of 18, with 497 units of content from 52 sources.
A narrative summary of the content analysis by criteria
category follows.

Ease  of  application.  A one  (negative)  valence  was  assigned 
to this criteria category, based upon a 76.2 percent frequency
of occurrence in the 84 units of content coded in
this category. Representative units of content, by valence,
were:

1. Negative (76.2%) -- A disadvantage of the application
of the Delphi technique is that experts are required,
and that experts are not easily found and chosen. Difficulties
arise in defining rules for the selection and the
composition of the panel of experts (12:4,61; 13:38; 28:474;
31:42; 48:182; 56:21; 80:31; 85:53,55). The number of experts
must  be  large  enough  to  assure  replicability  of  results 
(80:15);  however,  establishing  the  required  number  of 
reliable  experts  to  assure  replicability  is  difficult  (71:22). 
When experts are identified within the organization, the
problem  becomes  selection  of  a panel  from  among  them;  when 
experts  are  identified  outside  the  organization,  the  problem 
of  expert  identification  is  much  more  difficult.  The  most 
serious  problem  in  finding  a panel  of  experts  is  in  finding 
a panel  who  will  not  only  agree  to  serve,  but  who  will  also 
be  available  for  a full  sequence  of  questionnaires  (63:53). 
Since  there  is  a tendency  for  high  panel  attrition  over  time 
(48:182; 75:20), experts may have to be paid to induce them
to  participate  (80:31). 

The Delphi technique requires a greater degree of
expertise on the part of the analyst than do other techniques
(58:160). The success of Delphi is highly dependent
on the skill of the analyst-administrator; very few analysts
have experience in using the technique (3:62; 80:31).

Analyst experience is required in determining what information
to feed back to the experts (71:45), in preparing
questionnaires (80:16), and in knowing when to stop the
iterations (31:26). If the volume of responses overwhelms
the analysts, the analysts may have difficulty in digesting
and collating what becomes an increasingly formidable amount
of material (72:342; 80:32). Responses are difficult to
aggregate (58:41), and the aggregation is quite expensive if
a computer is used (33:132). If there is no meaningful way
to aggregate the panel responses, one would probably not want
to use the Delphi technique (3:62). Another major Delphi
problem arises in understanding the meaning of the distribution
of estimates which have been obtained from the experts
(37:70). Translating the information obtained into implementable
action plans is a formidable task (13:42). Still
another problem is that busy executives often claim that
Delphi forecasts and plans are outside the mainstream of
their activities, and thus assign Delphi lower priority
than immediate problems (13:42).

The Delphi technique requires the use of other techniques
to begin the iterative process (58:160), and requires
a degree  of  quantification  to  be  imposed  upon  subjective 
judgemental  factors.  The  definition  of  this  quantification 
is  a matter  of  principal  concern  to  the  design  team  (90:151). 
In  order  to  use  the  Delphi,  the  principles  of  probability 
must  be  understood  by  the  experts  (3:64). 

There  is  also  a legal  difficulty  in  government 
agencies  using  the  Delphi  technique.  An  Act  of  Congress 
forbids  an  agency  to  conduct  or  sponsor  a study  in  which 
questionnaires  are  circulated  to  more  than  nine  respondents 
without prior permission of the Office of Management and
Budget (31:69; 42:55; 80:40).

2. Mixed (12.5%) -- The amount of uncertainty that can
be tolerated must be considered (37:73). Close cooperation
is required between the design team and the intended user
(90:151). Planners must match the type of output desired
with the alternative outputs which Delphi can produce (37:73).

3. Positive (14.3%) -- Delphi does not require participants
to meet together, thus allowing a larger number of
consultants to be used (58:160; 77:218; 80:27; 85:52; 86:18).
The appeal of the technique to potential users is its
simplicity, popularity, and directness. The appeal to
researchers is Delphi's low cost and relatively painless
methodology. A study can be conducted and a paper produced
with relatively small effort (75:31,62-63).

Adaptability  and  flexibility.  A five  (positive)  valence 
was  assigned  to  this  category,  based  on  a relative  frequency 
of occurrence of 75.6 percent in the 45 units of content.
Some valence comments were:

1. Negative (13.3%) -- The usefulness of Delphi forecasts
for corporations is uncertain (37:68), since it is
difficult to see what can be done with the "consensus" when
at last it is ascertained (68:77). Delphi results provide
no means of relating a forecast to the long-range planning
of a company (15:176; 48:188). Moreover, Delphi should not
be considered for routine decision making (85:55).

2. Mixed (11.1%) -- The Delphi procedure has broader
potential in the analysis of uncertainty than in the estimation
of a group probability density function (58:156). The
technique should not be used to predict an unchanging future;
it should be used to identify possible futures (83:14).
Application of on-line computers and other devices with the
technique is not always easy and practical (80:31).

3. Positive (75.6%) -- Delphi has been applied to a
variety of problems, with versatile application to virtually
any area where "experts" can be found (25:6; 58:160; 75:31).
Procedures for the conduct of a Delphi exercise are not
standard and can be changed to suit the situation. Delphi
was listed as the second most utilized technique for long-range
studies by industry (25:6). However, Delphi is not
limited to forecasting problems. It can also be used to
reach final decisions, to arrive at basic analysis inputs,
to develop pro and con arguments for political decisions
(80:7,27), to stimulate new ideas and alternatives (71:21;
80:30), to evaluate alternative solutions to problems (65:10),
to clarify and establish meaningful criteria and objectives,
to reduce the number of contingencies, and to communicate
before meetings are held (80:27,30). The technique appears
to be highly useful in generating preliminary insights into
highly unstructured or undeveloped subject areas, leading to
greater insight on the target problem (64:10; 75:47). The
method can be used to forecast subjective aspects of technologies
as well as to generate more objective information
(69:10). Delphi may serve to stimulate the experts into
taking into account considerations they might have neglected
(44:6; 46:2; 47:5-6), and seems to be an indispensable
instrument for the technological assessment of research and
development projects (28:482). There have already been some
useful applications of Delphi in Department of Defense
problems; it is anticipated that in the future Delphi will
be employed more frequently in Army systems analyses (80:7-8).

Reliability  and  validity.  A relative  frequency  of  occurrence 
of  71.3  percent  resulted  in  a one  (negative)  score  for  this 
area  category,  with  129  units  of  content  coded.  Comments 
included: 

1. Negative (71.3%) -- Little is known about Delphi's
validity, in the sense of yielding more reliable results than
rival methods (45:5; 52:39; 75:43,49,67,70; 80:29). Validation
of the technique is needed (9:452; 75:65), but may be
difficult, if not impossible, to obtain (80:32). The validity
of the methodology and data obtained using Delphi has yet to
be demonstrated (37:66; 48:186,188; 75:24). The evidence
advanced in support of Delphi reliability is less than
sufficient (48:180; 56:21) since little research has been
done to evaluate the accuracy of Delphi forecasts (63:31;
75:15,70). Acceptance of the accuracy of Delphi is a tacit
assumption that has not been clearly explored, either by the
rationale of Delphi methodology or by empirical evidence
(37:69). Reliability of the Delphi is critically weakened
by an absence of recognized administrative standards to guide
implementation of the technique (48:184; 75:27,66,69,70). A
Delphi prediction of a future event is meaningless if there
is no measure of the predictive reliability of the actual
occurrence of the event (15:175). Forecasts generated by
Delphi are too ambiguous to serve planners (48:188). The
assumption of Delphi that authentic experts exist for predicting
the extremely complex events common in Delphi
applications may be wishful thinking; Delphi results often
represent informed opinion, rather than expert opinion
(75:34,35). The quality of the Delphi results is only as
good as the quality of the experts (28:474; 33:134). The
Delphi technique is an attitude polling technique dealing
in "snap" judgements of ill-defined issues, which produce
short-lived attitudes about the future which are quite different
from systematic predictions of the future (75:38).

2. Mixed (6.2%) -- The validity of the Delphi procedure
may be considered established in an intuitive sense
(45:5). The increased employment of the Delphi in recent years
probably goes beyond that which is justified
by the controlled experimentation done to date. The
assumptions or presumptions that the experimental findings
apply to real problems may be questionable (80:28-29).
Delphi designers may be accused of ignoring scientific rigor;
but they are meeting a demand that cannot be met otherwise,
and developing a body of useful knowledge on both good and
bad design techniques (88:183).

3. Positive (22.5%) -- Delphi is at least as good
as, if not better than, other long-range forecasting techniques
(64:7; 85:56). A series of experiments with short-range
forecasts showed that the Delphi method was superior
to conventional methods of business forecasting (63:31).
Delphi procedures usually lead to increased accuracy of
group responses (21:vi; 22:1; 76:435). Delphi is more accurate
than direct confrontation (85:56), at-large experts
(57:20,30), face-to-face committee procedures (57:30; 63:52),
and unarticulated intuitive judgements (56:21). Delphi
is an excellent method for gathering a group of opinions
and forming a general consensus (65:9). The technique
provides a systematic and objective analysis procedure for
a group of decision makers (24:42; 71:21; 80:30). Proponents
of Delphi stress three attributes which contribute to authentic
consensus and valid results: anonymity, statistical
response, and iterative polling with feedback (75:4).
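
The statistical response and feedback attributes can be pictured with a small Python sketch. The source does not say which statistics are fed back to the panel; the median and interquartile range used here are an assumption drawn from common Delphi practice, and the estimates and the function name `summarize_round` are hypothetical.

```python
import statistics


def summarize_round(estimates):
    """Summarize one anonymous round of estimates for feedback to the panel.

    Returns the group median, the quartile bounds, and the (anonymous)
    positions of estimates outside the interquartile range, whose authors
    might be asked to justify or revise them in the next round.
    """
    median = statistics.median(estimates)
    q1, _, q3 = statistics.quantiles(estimates, n=4)
    outliers = [i for i, x in enumerate(estimates) if x < q1 or x > q3]
    return median, (q1, q3), outliers


# Hypothetical first-round estimates of development time in months.
round_one = [10, 12, 15, 20, 40]
median, (q1, q3), outliers = summarize_round(round_one)
print(median, q1, q3, outliers)
```

Only positions in the list are reported, not names, which mirrors the anonymity attribute; repeating the call on each round's revised estimates gives the iterative polling with feedback.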

Time. A negative valence of one was given this criteria
category,  based  on  a relative  frequency  of  occurrence  of 
91.7  percent  in  36  units  of  content.  Category  comments 
included: 

1. Negative (91.7%) -- The Delphi process is quite
time-consuming (3:64; 17:420; 33:64; 37:67; 58:41,60; 67:425;
71:46; 80:30). Distribution of the Delphi questionnaires
usually takes considerable time (80:30). Getting responses
to the questionnaires once they have been sent is also slow
(1:30; 3108; 72:337; 85:55). The extensive number of iterations
required in the Delphi also results in a heavy investment
of time (1:33; 89:249). As the number of iterations
increases, the amount of time necessary to complete the
technique increases (15:249; 71:22,46). If the time allotted
is short, the experts may not have time to give Delphi
questionnaires adequate attention (3:62; 85:55). If there
are long periods of time between sessions, the experts may
lose their train of thought or rationale (3:62). Thus, if
there is little time available, the Delphi technique may
not be a viable alternative (3:62).

2.  Mixed — None  coded. 

3. Positive (8.3%) -- The Delphi uses little of the
experts' time, compared to other group communication methods
(3:62; 85:55; 88:183).

Removal  of  bias.  A five  (positive)  valence  for  this  criteria 
category resulted from a 65.6 percent relative frequency of
occurrence  in  122  units  of  content.  The  units  of  content 
included: 

1. Negative (27.9%) -- Delphi can easily slant results
in the direction of vested interests, and can produce manipulated
convergence of opinion reflecting short-lived attitudes
of very small samples of unknown individuals (75:58,63). The
form of the Delphi questions may exert too great an influence
on the responses (56:21). Since Delphi requires public
opinion sampling techniques, it may introduce other kinds of
bias. The technique also is vulnerable to selection bias
in the selection of experts (48:182), and to the individual
biases of analyst-administrators (4:149). If the Delphi
questionnaires are prepared by unqualified analysts, the
responses to the questionnaires may be biased (3:62) since
each respondent would answer the questionnaire based upon a
different set of assumptions (31:21). Experts have a
tendency to be conservative when faced with the uncertainties
of the future, which allows another bias to be introduced
(71:22; 80:32). There exists an uncontrollable and
unknown expert halo effect in Delphi, contributing to expert
oversell (75:34,41,69). If experts are chosen for subjective
or reputational reasons, there is a possibility that the
panel will be composed of individuals favorably disposed
toward the Delphi, thereby introducing bias (48:182). The
artificial shifting of the group's view or generation of an
artificial consensus because of dropouts has always been a
problem in Delphi exercises (88:182). Delphi asks about
event stereotypes, and experts respond with stereotyped
estimates (75:50).

2. Mixed (6.5%) -- The Delphi procedure is not an
absolute guarantee against the degrading influence of the
"bandwagon" effect (63:62). Nor is specious persuasion
necessarily eliminated by impersonalizing the interaction
(4:149). Although pressure for conformity still operates
with Delphi, it is an internal and individual pressure
(31:27). Individuals may misinterpret the Delphi exercise
to be a policy decision tool as opposed to a policy analysis
tool; the design team must inform the respondent group of
the real intent and purpose of the exercise (90:154). If
Delphi is used to determine objectives of an analysis, it
may be undesirable or embarrassing to identify the real
objectives (80:29). Delphi may serve to stimulate the
experts into taking into due account considerations they
might inadvertently have neglected, or dismissed as unimportant
on first thought (22:6; 46:2; 47:5-6).

3. Positive (65.6%) -- Delphi avoids the difficulties
and impracticalities of group discussion by avoiding face-to-face
confrontation (3:44; 58:41; 64:2); there is no
pressure to arrive at a consensus (21:4). Interaction by
the group members (experts) is handled in an anonymous
fashion (21:5; 57:18; 63:20; 85  036), thus avoiding the
possibility of identifying a specific opinion with a
particular person (57:17,84; 63:20). This anonymity tends
to make the experts less inhibited in their judgements (3:44-45;
12:3; 190; 21:16; 63:20; 65:9; 71:22; 80:11; 75:17),
and is conducive to independent thought and a more gradual
formulation of a considered opinion (64:2). Delphi is
designed to overcome the difficulty of an expert unwilling
to abandon publicly expressed opinions (4:149; 12:2; 17:419;
42:120; 47:5; 63:20; 67:413; 8000; 8501). It is also
designed to overcome the "bandwagon" effect of majority
opinions (4:149; 17:419; 42:120; 67:413; 8000; 8501), and
to reduce specious persuasion by the expert(s) with the
greatest supposed authority (42:120; 470; 57:18; 85OI), or
with a domineering personality (69:1; 86:2). The experts can
evaluate information strictly on the information's merit,
without being influenced by the personalities and status of
the other contributors (1308; 83:16; 85:56). Delphi
eliminates personal antipathy or excessive respect for a
particular individual's opinions and individual skill in
verbal debate (4:149). Delphi reduces conformity (3:63; 21:16;
80:30) and irrelevant or redundant material (80:30).
The reduction in group pressure to conform means that the
experts' responses are more likely to reflect their true
opinions (58:160). Delphi assures that the opinion of every
expert in the group is represented in the final responses
(13:37; 21:16), and is designed to call out the kinds of
information that each expert feels would enable him to
arrive at a confident answer to the question (64:2).


Miscellaneous. A three (mixed) valence was assigned to this
category, based on a tie in relative frequency between the
one (negative) and five (positive) valences in the 81 units
of content. The valences were averaged to give the score of
three. Some valence comments were:

1. Negative (45.7%) -- After several iterations, the
expert may be faced with evaluating projections in areas
outside his area of expertise (14:147). Delphi gives the
expert who finds himself in the majority a false sense of
confidence (28:474; 80:32). The technique has been criticized
for its highly normative character which often results
from the efforts of the Delphi administrator to obtain
consensus or convergence. Convergence is made at the
expense of the extreme opinions which in some instances can
be interesting (28:470,474). The Delphi is silent on how
much iteration is enough (48:184). Once a Delphi exercise
has started, there is no way to guarantee or control for a
specified outcome (90:8). With Delphi, it is difficult to
take into account the unexpected, and the technique discourages
any attempt to reintroduce broader concerns as the
studies advance (56:21). Delphi should not be considered
when reliable data and time are available (76:168). Delphi
pays inadequate attention to psychological values and attitudes
toward the future, and as presently practiced, is a
psychological projective technique for future "inkblots"
(75:29,51).

2. Mixed (8.6%) -- The Delphi is not just another
polling scheme, and polling practices should not be transferred
to Delphi practice without close scrutiny of their
applicability (90:157). The Delphi technique has many
compelling qualities for lazy investigators, so that there
is a danger of a stampede of Delphi "hustlers" forming
(77:217).

3. Positive (45.7%) -- The Delphi technique creates
a well-defined process that can be described quantitatively
(21:vi; 56:21; 63:32). Delphi fits into a hierarchical
structure of objectives and courses of action. The technique
assures that every expert is represented in the final response
(57:18), allowing both the majority and minority to have
their views presented to the experts as a group (17:420;
63:20-21). The technique forms a consensus of opinion by
requiring justification for any significant deviation from
the group average (85:52). Extreme responses may be even
more useful to management than those of the majority (80:27).
The results of a Delphi exercise are subject to greater
acceptance on the part of the group than are the consensuses
arrived at by more direct forms of interaction. Delphi
stimulates thinking and involves management in the forecasting
process. This by itself could well be enough to justify its
use (37:74).
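
The tie rule applied to the Miscellaneous category for Delphi can be checked with a few lines of Python. The counts of 37 negative, 7 mixed, and 37 positive units are inferred from the reported percentages of the 81 units and are not stated in the source; the function name is hypothetical.

```python
def valence_with_ties(counts):
    """Return the most frequent valence; average the tied valences on a tie.

    counts maps a valence code (1, 3, or 5) to its units of content.
    An empty category scores zero.
    """
    if not counts or sum(counts.values()) == 0:
        return 0
    top = max(counts.values())
    tied = [valence for valence, n in counts.items() if n == top]
    return sum(tied) / len(tied)


# Delphi, Miscellaneous category: valence code -> inferred units of content.
misc = {1: 37, 3: 7, 5: 37}
total = sum(misc.values())
shares = {v: round(100 * n / total, 1) for v, n in misc.items()}
print(total, shares, valence_with_ties(misc))
```

With these counts the negative and positive shares both round to 45.7 percent, and averaging the tied valences of one and five gives the three reported for the category.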

The  DeGroot  Consensus 
Method 

This  relatively  new  method  received  a score  of  seven 
based  on  six  units  of  content  from  two  sources.  Units  of 
content  are  summarized  below. 

Ease  of  application.  This  category  received  a five  (positive) 
valence,  since  all  of  the  units  of  content  had  a five  valence. 
Comments included: the method is intuitively appealing
(27:118); the simplicity of the procedure has much to
recommend it (49:283); the method presents simple conditions
for determining whether it is possible for a group to reach
a consensus; and when consensus can be reached, the weights
to be used in the consensus can be explicitly and
simply calculated (27:118).
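
The calculation of consensus weights can be illustrated with a minimal Python sketch of the DeGroot model as it is usually formulated: each expert repeatedly replaces his estimate with a weighted average of all the experts' current estimates, and if the repeated averaging converges, every expert arrives at the same weighted combination of the initial estimates. The source does not spell out these mechanics, so the formulation, the weight matrix, the point estimates, and the function name `consensus_weights` are assumptions for illustration.

```python
def consensus_weights(trust, iterations=200, tol=1e-12):
    """Approximate the consensus weights for a row-stochastic trust matrix.

    trust[i][j] is the weight expert i gives to expert j's opinion; each row
    sums to one. The matrix is raised to successive powers; if all rows of
    the power agree, that common row holds the consensus weights.
    Returns None when the rows have not agreed within the tolerance.
    """
    n = len(trust)
    power = [row[:] for row in trust]
    for _ in range(iterations):
        power = [
            [sum(power[i][k] * trust[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    first = power[0]
    for row in power[1:]:
        if any(abs(a - b) > tol for a, b in zip(row, first)):
            return None
    return first


# Hypothetical two-expert panel: expert 0 weighs both opinions equally,
# expert 1 gives three-quarters of the weight to his own opinion.
trust = [[0.5, 0.5], [0.25, 0.75]]
weights = consensus_weights(trust)
estimates = [100.0, 130.0]  # initial point estimates, arbitrary units
consensus = sum(w * x for w, x in zip(weights, estimates))
print(weights, consensus)
```

In this hypothetical panel the weights settle near one-third and two-thirds. A panel in which two experts only ever adopt each other's view, such as `[[0, 1], [1, 0]]`, never settles, and the sketch returns `None`. The same weights would be applied to the experts' probability distributions rather than to point estimates.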

Adaptability and flexibility. The one unit of content in
this  category  was  coded  one  (negative),  and  stated  that  the 
postulated  form  of  the  revised  distributions  may  be  too 
restrictive  (49:283). 

Reliability  and  validity.  The  one  unit  of  content  in  this 
category  was  coded  one  (negative).  The  comment  was  that  the 
method  had  not  yet  been  empirically  tested  (49:283). 

Time. This category received a score of zero; no units of
content  were  coded  in  this  category. 

Removal of bias. A score of zero was assigned for this
criteria  category,  since  no  units  of  content  were  coded  in 
the  category. 

Miscellaneous. This category also received a score of zero
as  no  units  of  content  were  coded  in  the  category. 

The  Direct  Estimation 
Technique 

The  evaluation  of  this  technique  resulted  in  a score 
of 13, based upon 20 units of content from five sources.
A narrative summarization follows.

Ease of application. A five (positive) valence was assigned
to  this  criteria  category,  based  on  a relative  frequency  of 
occurrence  of  71  percent  in  the  seven  units  of  content  in  the 
category.  Representative  comments  by  valence  were: 

1. Negative (29%) -- The expert must be knowledgeable
in probability theory (58:161). People often require training
before they accept the technique (55:24).

2.  Mixed — None  coded. 

3.  Positive  (71%) --This  technique  is  simple  in 
application  (3:24;  55:25;  87:45).  Direct  Estimation  does  not 
require  another  individual  to  interview  the  expert  (58:161), 
and  a probability  distribution  can  be  generated  without 
burdensome  calculations  (55:25). 

Adaptability  and  flexibility.  A one  (negative)  valence  was 
assigned  to  this  category,  based  upon  a one  valence  coded  to 
the  one  unit  of  content,  which  stated  that  situations  in 
which  direct  estimation  seemed  least  satisfying  were  those 
concerning rare, unusual, or unheard of instances (10:B-433).

Reliability  and  validity.  A relative  frequency  of  occurrence 
of  67  percent  resulted  in  a one  (negative)  valence  for  this 
area category, with six units of content coded. Comments
were:

1. Negative (67%) -- This technique has little likelihood
of success in most cases (3:24); its validity has
been questioned (55:24). Direct estimation is least likely
to work because individuals do not think directly in terms
of probabilities (87:45). Also, the technique does not
check on the consistency of the responses (58:161).

2. Mixed (33%) -- This technique probably produces
the least valid responses (55:162), but may be a good technique
(10:B-433).

3. Positive -- None coded.

Time.  A positive  valence  of  five  was  given  this  criteria 
category,  since  all  of  the  five  units  of  content  were  coded 
five.  Category  comments  included: 

1.  Negative — None  coded. 

2.  Mixed — None  coded. 

3.  Positive  (100%) — Direct  estimation  takes  place 
very quickly (10:B-428,B-433; 55:25; 58:161).

Removal  of  bias.  A five  (positive)  valence  was  assigned  to 
this  category,  since  the  one  unit  of  content  was  coded  five. 
The  comment  was  that  there  is  no  possibility  that  the 
interviewer  will  impose  his  bias  on  the  responses  (55:161). 


Miscellaneous. A score of zero was assigned to this category,
since  there  were  no  units  of  content  coded  to  the  category. 

CONCLUSIONS 

As  can  be  seen  in  Table  10,  the  Standard  Lottery 
technique  received  the  highest  evaluation  score  of  24,  thus 
answering  the  research  question  and  meeting  the  research 
objective.

The researchers believe that these results are
inconclusive, for the following reasons:

1. The sample size, both in number of sources and
in  number  of  units  of  content,  was  highly  variable;  the 
number  of  sources  ranged  from  two  for  the  DeGroot  Consensus 
technique  to  52  for  the  Delphi  technique,  with  units  of 
content  varying  in  number  from  six  for  the  DeGroot  method 
to  497  for  the  Delphi  technique.  When  using  content  analysis, 
with  resulting  data  of  face  validity,  the  researcher  needs 
as many sources of data as possible in order to obtain conclusive
results. This proved difficult to do in the field
of  subjective  probability  assessment.  The  only  technique 
that  had  a large  number  of  sources  was  the  Delphi,  with  52 
sources.  The  Standard  Lottery,  with  nine  sources,  was  a 
distant  second  to  the  Delphi.  Thus,  in  order  to  make  valid 
comparisons  between  the  assessment  techniques,  more  sources 
of  data  would  be  required  for  techniques  other  than  the 
Delphi. 

2. Several of the techniques had not been empirically
tested to a great degree. This greatly affected the
number of sources available to the research team. More
field testing of the assessment techniques is required before
stronger conclusions can be drawn as to whether the Standard
Lottery is the technique best suited for subjective probability
assessment.
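
The imbalance described in the first reason can be put in numbers. The Python sketch below uses only the source and unit counts reported in this chapter for four of the techniques; the unit count for the Standard Lottery is not restated here, so that technique is left out. The variable names are hypothetical.

```python
# (sources, units of content) as reported in the narrative summaries.
samples = {
    "Delphi": (52, 497),
    "Direct Estimation": (5, 20),
    "Modified Churchman-Ackoff": (2, 14),
    "DeGroot Consensus": (2, 6),
}

total_units = sum(units for _, units in samples.values())
for name, (sources, units) in samples.items():
    share = 100 * units / total_units
    print(f"{name}: {sources} sources, {units} units ({share:.1f}% of these units)")

delphi_share = 100 * samples["Delphi"][1] / total_units
```

Among these four techniques Delphi supplies more than nine-tenths of the units of content, so comparisons across techniques rest on very uneven evidence.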

COROLLARY CONCLUSION 1

All of the units of content were summarized by
criteria  category  to  find  which  criteria  categories  were 
mentioned  most  frequently  in  the  critical  literature,  and 
therefore,  which  categories  were  of  the  most  interest  to 
users/researchers.  As  shown  in  Table  11,  reliability  and 
validity  ranked  first,  followed  by  ease  of  application, 
removal  of  bias,  adaptability  and  flexibility,  and  time. 

Thus,  users  are  more  interested  in  the  "goodness"  of  an 
assessment  technique  than  in  range  of  application  or  speed 
of  results. 

COROLLARY  CONCLUSION  2 

Each  criteria  category  was  also  surveyed  to  find 
the  sub-areas  within  each  category  which  were  mentioned  most 
often  in  the  units  of  content;  for  example,  expert  training 
required  was  one  sub-area  in  the  ease  of  application  criteria 
category.  These  sub-areas,  and  their  frequency  of  occurrence, 
are shown in Tables 12 through 15.

Units of Content Summary

TABLE 12

Ease of Application Criteria Category

Sub-Areas                                      Number of Units
                                               of Content

Ease of Use/Administration                           28
Simplicity                                           13
Expert Selection/Definition/Motivation               28
Number of Experts/Analysts Required                   8
Expert Training/Skills Required                      23
Analysts Skills Required                             18
Acceptance of Technique by Analysts/Experts           4
Workload/Attrition of Experts                         5
Cost of Use                                           3

TABLE 13

Adaptability and Flexibility
Criteria Category

                                             Number of Units
Sub-Areas                                      of Content

Application to a Variety of Situations                 17
Potential Usefulness                                    5
Comparative Utilization/Frequency of Employment         2
Limitations                                             2
Flexibility of Procedures                               2
Usefulness of Specific Applications                    28




TABLE 14

Reliability and Validity
Criteria Category

                                             Number of Units
Sub-Areas                                      of Content

Method Validity                                        52
Data/Results Validity                                  29
Ease of Validation/Evidence of Validity                 4
Accuracy/Precision of Results                          19
Vagueness of Results/Method/Data                       13
Reliability/Evidence of Reliability                     6
Consistency of Responses                                9
Objectivity of Results/Data                             6
Goodness of Experts/Respondents                        13
Relative Effectiveness Compared to Other Techniques    18


TABLE 15

Time and Removal of Bias
Criteria Categories

                                             Number of Units
Sub-Areas                                      of Content

Time Required for Application of Technique             50
Analyst/Expert Bias                                    3?
Method/Procedure Bias                                  12
Conformity/Forced Consensus Effect                     23
Status Bias/Personality Dominance Effects              14


Chapter 5

RECOMMENDATIONS FOR FURTHER RESEARCH

GENERAL

Vague  areas  still  exist  in  both  the  behavioral  and 
statistical  aspects  of  subjective  probability  assessment. 
Empirical  testing  of  assessment  techniques  has  been 
fragmentary.  Below  are  some  specific  recommendations  for 
further  research  in  order  to  broaden  the  base  of  practical 
knowledge  concerning  subjective  probability  assessment 
techniques. 

RESEARCH RECOMMENDATIONS

1. Follow-on research in the actual application of
the different subjective probability assessment techniques
would be useful in comparing their effectiveness and utility.
Subjective judgement, through the application of the
techniques, would be made explicit, providing feedback on
the techniques and therefore an impetus to the validation
and improvement of the techniques in general. In a very
real sense, a "laboratory" that is suitable for testing
these techniques is a weapon system source selection. The
techniques have little utility unless they can be applied to
such long-term processes as source selection, which are
characterized by the sequential accumulation and assessment
of information. An ASD source selection is recommended as
such a "laboratory." The recommended sequence of events is
as follows:

a. The research team, in conjunction with the
USAF Business Research Management Center (BRMC) and the
Aeronautical Systems Division (ASD), would identify a weapon
system source selection completed within the previous six
months, and also identify 15 to 20 individuals who served as
evaluators on the associated Source Selection Evaluation
Board (SSEB). Permission would first be obtained through
the BRMC from the Source Selection Authority to allow the
source selection to be used as the "test bed," and to make
appropriate source-selection documents available to the
research team. Permission would next be obtained, also
through the BRMC, from the supervisors of the identified
evaluators in order to allow their participation in the
research effort as subjective probability estimators.

b.  The  researchers  would  ascertain  risk  areas 
identified  during  the  source  selection,  using  the  source- 
selection  documents. 

c. To the extent possible, the research team
would provide the estimators with tutoring in the concepts
of subjective probability, to ensure that each estimator had
an understanding sufficient to apply the technique. Records
would be kept of how much training was required, for later
use in evaluating the techniques.
d. Using each subjective probability assessment
technique, the research team would elicit subjective
probability distributions for each identified risk area.
Five to seven estimators would be randomly selected from the
total estimator group for the application of each technique.
For techniques in which group consensuses are not reached,
an average of the estimators' subjective probabilities for
the associated risk area would be used to aggregate the
estimators' responses for that risk area.

e. The resulting assessment of uncertainty would
be quantified, using the Martin Cost Model (see Appendix E),
into the expected final cost for the selected weapon system.

f.  To  ascertain  the  behavioral  aspects  of  each 
subjective  probability  technique,  each  estimator  would,  upon 
completion  of  his  estimation  for  an  individual  technique, 
answer  a questionnaire  containing  the  following  questions: 

(1)  Will  introduction  of  the  subjective 
probability  technique  into  the  source-selection  process  serve 
to  clarify,  to  confuse,  or  to  have  no  noticeable  impact  upon 
the  estimator's  subjective  probabilities? 

(2) Will introduction of the technique increase,
decrease, or have no noticeable impact upon an estimator's
confidence in the accuracy of his indicated subjective
probabilities?

(3)  Will  introduction  of  the  technique 
increase,  decrease,  or  have  no  noticeable  impact  upon  the 
(4) How satisfied would an estimator be
with  whatever  assessment  structure  emerged  from  introduction 
of  the  technique? 

(5)  To  what  extent  would  the  estimator 
regard  the  clarification,  confidence,  or  satisfaction  obtained 
from  the  technique  as  worth  the  additional  time  and  effort 
(marginal  cost)  in  applying  the  technique? 

(6)  To  what  extent  would  the  estimator 
utilize  the  technique  in  other  decision  situations? 

g.  The  future  cost  estimates  obtained  from  the 
Martin  Cost  Model  would  be  used  to  validate  each  technique, 
when  compared  with  the  actual  final  cost  of  the  weapon  system. 

2.  There  is  a great  need  for  research  in  the  area 
of  how  people  evaluate  or  assess  uncertainty  in  a descriptive 
sense;  that  is,  in  day-to-day  activities,  instead  of  just 
people  performing  under  university  test  conditions.  It  is 
difficult  to  infer  how  good  (or  bad)  decisions  are  made  from 
experiments  with  university  test  subjects.  "Real-world" 
situations  should  be  sought  out  for  research. 

3. Research needs to be done into what the subjective
assessment task means to the assessor in terms of his
attitudes and motivation toward assessment.

4. Research is needed into the amount of training
that individuals with varying degrees of statistical
sophistication need to obtain a working knowledge of
subjective probability assessment.

5. Finally, research would be useful in the area
of  the  attitudes  of  experienced  decision  makers  toward 
subjective  probability  assessment  and  its  uses. 
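
Step d of the first recommendation (random selection of five to seven estimators, then averaging where no group consensus is reached) can be sketched in Python as follows. This is only an illustration under stated assumptions: the estimator labels, the three outcome bins, and the probabilities are hypothetical, and averaging outcome by outcome is one reading of the aggregation rule, which the recommendation does not specify further.

```python
import random
from statistics import mean


def select_estimators(pool, k, seed=None):
    """Randomly pick k estimators from the pool, without replacement."""
    rng = random.Random(seed)
    return rng.sample(pool, k)


def aggregate_by_average(distributions):
    """Average several subjective probability distributions outcome by outcome.

    Each distribution is a list of probabilities over the same outcomes.
    ASSUMPTION: a simple unweighted average is used to combine estimators.
    """
    n_outcomes = len(distributions[0])
    for dist in distributions:
        if len(dist) != n_outcomes:
            raise ValueError("all distributions must cover the same outcomes")
        if abs(sum(dist) - 1.0) > 1e-9:
            raise ValueError("each distribution must sum to 1")
    return [mean([dist[i] for dist in distributions]) for i in range(n_outcomes)]


# Hypothetical pool of 18 evaluators (the text calls for 15 to 20).
pool = [f"estimator_{i:02d}" for i in range(1, 19)]
panel = select_estimators(pool, k=5, seed=1)

# Hypothetical elicited probabilities for one risk area, over three
# illustrative outcomes (for example: low, moderate, high cost growth).
elicited = [
    [0.2, 0.5, 0.3],
    [0.4, 0.4, 0.2],
    [0.3, 0.3, 0.4],
    [0.1, 0.6, 0.3],
    [0.5, 0.3, 0.2],
]
combined = aggregate_by_average(elicited)
print(panel)
print(combined)
```

Because each input distribution sums to one, the averaged distribution also sums to one, so it can be passed on to a later step without renormalizing.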


APPENDIX  A 


WEAPON  SYSTEM  PROGRAM  UNCERTAINTIES 

National  Objectives  and  Strategies 
Present  Defense  Systems  Capabilities 
Defined Threat or Proposed Change/Innovation
Current/Future  State  of  Technology 
Fiscal  Information/Available  Resources 
Desired  Date  for  Operational  Capability 
Expected  Operational  Environment 
Mission  Responsibility  Assignment/Harmonization 
Mission  Objectives  and  Priorities 
System  Operational/Functional  Requirements 
Performance  Envelopes/Design  Constraints 
Necessary  Technology  Advance  and  Risk  Assessment 
Estimated  Program  Costs  Schedules/Concurrency 
Program  Approval  and  Budget  Authorization 
Rudimentary Development Plans and Objectives
System  Performance/Design  Requirements 

Initial  Specification  Tree  Subsystem  Interface  Definition 

End  Item  Performance/Design  Requirements 

Maintenance  and  Logistics  Plans 

Test  and  Evaluation  Concepts 

Training  and  Personnel  Requirements 



Realistic  Program  Costs  and  Schedules 

Program  Management/Development/High  Risk  Areas 

Long  Lead  Parts,  Tooling  and  Facilities 

Applicable Specifications/Waivers

Feasible  Design  Approach  for  End  Items 

Preliminary  Drawings  for  Modules/Units 

Reliability/Maintainability  Budgets  for  End  Items 

Critical  Components/Design  Areas  Identified 

Subsystem  Specifications 

End  Item  Interfaces  Defined 

Preliminary  Operational  Facilities  Criteria 

Test  Facility/Range  or  Support  Agency  Requirements 

Identified/Approved Engineering Design Changes

End  Item  Configuration  and  Acceptance  Requirements 

Detailed  Design  and  Assembly  Drawings 

Circuit  Diagrams,  Mechanical/Packaging  Layouts 

Quality  Assurance  and  Test  Requirements 

Estimated  Production  Rates/Quantities/Deliveries 

Process  Specifications  and  Standards 

Make  or  Buy  Decisions 

Configuration  Control  Plans 

Long  Lead  Parts/Materials/Tooling  Quantities 

Parts  Lists,  Components  Space 

Needed  On-Dock  Delivery  Dates 

Purchase  Authorizations 

Material  Sources  and  Market  Prices 

Permissible Substitution Parts Lists

Receiving and Inspection Instructions

Preliminary Design and Assembly Drawings

Shop  Fabrication  Instructions 

Required  Materials  and  Parts 

Test  Objectives,  Environment,  Expected  Results 

Detailed  Test  Plans  and  Procedures 

Test Facility, Support Equipment, Instrumentation

Known  Configuration  of  Test  Hardware 

Test  Measurements,  Data,  Variables,  Parameters 

Report  Documentation  Required 

Production  Line/Material  Handling  Layouts 

Tooling  Design  Jigs  and  Fixtures 

Production  Facilities  and  Factory  Test  Equipment 

Materials  and  Parts  Inventory  On-Hand 

Routing,  Scheduling  and  Dispatch  Orders 

Production  Procedures,  Plans  and  Processes 

Realistic  Cost  and  Delivery  Schedules 

Subcontractor  Conformance  Space 

Inspection  Tolerances 

End  Item  Acceptance  Test  Requirements 

Test  Objectives,  Extreme  Environment  Conditions 

Acceptable Quantity/Time Duration Sample Sizes

Test Measurements, Data, Variables, Parameters

Data  Reduction  and  Analysis  Procedures 

Report  Documentation  Requirements 

Test  Objectives,  Environment  Defined 

Acceptable  Demonstration  Criteria  Per  System  Specifications 


Detailed  Test  Plans  and  Procedures 

Test  Measurements,  Data,  Variables  and  Parameters 

Test  Site,  Support  Equipment,  Instrumentation 

Support  from  Range/Other  Contractors/Agencies 

Production  Hardware  Including  Necessary  Spares 

Other  Required  System  Segments/Elements 

Data  Reduction  and  Analysis  Procedure 

Report  Documentation  Requirements 

Training  Course  Materials 

Required  Training  Equipment  and  Facilities 

Qualified  Instructors 

Field  Requirements  for  Trained  Personnel 
Scheduled  Number  of  Students 

Examination  for  Minimum  Acceptable  Skill  Level 
Percentage  Expected  to  Attain  Achievement  Level 
Shipping  and  Transportation  Plans 
Receival  Inspection  Procedures 
Operation  Facilities  Constructed 
Support  Facilities/Equipment  on  Hand 
Installation, Assembly, Check Out Procedures
Equipment  Scheduled  Delivery  Dates 
Realistic  Costs  and  Schedules  to  Completion 
System  Performance  Demonstration  Plans 
Operation  Plans,  Instructions,  and  Manuals 
Maintenance  and  Logistics  Plans 
Personnel  Subsystem  Evaluation  Plans 
Reliability,  Maintainability,  Evaluation  Criteria 


User  Performance  Capability  Evaluation  Criteria 
Required  Data  and  Reports  on  System  Performance 
Data  Reduction/Analysis  Techniques  Responsibility 
System  Acceptance  and  Turnover  Agreement 
Transition  of  Logistic  Support  Responsibility 
Preliminary  Follow-On  Plans 
Recommended  Changes  to  System  Design 
Inputs  for  Next-Generation  System  Concept 
Program-Completion  Objectives  Accomplished 
Human Engineering


PHRASES USED TO CODE UNITS OF CONTENT
TO  EACH  CRITERIA  CATEGORY 

EASE  OF  APPLICATION 


Acceptance 

Aggregation  of  group  response 

Amount  of  material  experts/analysts  must  process 

Application  of  computers 

Attrition 

Communications  between  experts 
Cost 

Cumbersomeness 

Ease  of  determining  values 

Ease  of  use/Application/technique 

Expert  availability 

Expert  composition 

Expert  inducements 

Expert  selection 

Expert  training  required 

Knowledge  of  probability 

Legal  ramifications 

Necessary  skills  to  administer/interview/measure 

Necessary  skills  to  analyze  results 

Necessary  skills  to  prepare  questionnaires 

Necessity  of  soliciting  cooperation  from  experts 

Necessity  to  meet  at  a common  time,  common  location 

Necessity  of  using  personal  interview 

Number  of  calculations 

Number  of  experts  required 

Simpleness 

Use  in  conjunction  with  another  technique 

Use  of  value  type  questions 

Workload 


ADAPTABILITY  AND  FLEXIBILITY 


Application  to  a variety  of  problems/situations 

Comparative  utilization 

Flexibility 

Frequency of employment

Limitations

Potential  usefulness 

Usefulness of specific application


RELIABILITY AND VALIDITY


Accuracy

Ambiguity 

Application  to  real  world  problems 
Check  on  consistency  of  responses 
Comparative  goodness  of  prediction 
Comparison  of  relative  effectiveness 
Data  validity 
Ease  of  validation 
Establishment  of  reliability 
Goodness  of  experts 
Internal  consistency  of  results 
Knowledge  of  validity 
Likelihood  of  success/working 
Method  validity 
Objectivity 

Objectivity  of  analysts 

Precision 

Reasonableness 

Representativeness  of  experts 
Statistical  significance 
Stereotypeness  of  thinking 
Sufficiency  of  reliability 
Suitability  for  evaluation 
Tested 
Vagueness 

Validity  of  probability  function 


TIME 


Length  of  time 

Length  of  time  between  iterations 
Time  allowed 

Time  comparison  to  other  techniques 
Time  estimated 
Time  factor 
Time  required/consumed 


REMOVAL  OF  BIAS 


Anonymity 

Antipathy for individual's opinions

Bandwagon effect

Bias  of  questionnaires 

Commitments to earlier statements

Comparison  to  psychological  difficulties  of  groups 

Conformity 

Effects  of  dominant  individual 
Ego  involvement 
Elimination  of  group  conflicts 
Expert  selection  bias 
Face  to  face  confrontation 
Halo  effect 

Individual  analyst/expert  bias 

Influence  of  individual  expert  personalities 

Introduction  of  bias 

Personality  dominance 

Specious  persuasion 

Status


PHRASES USED TO CODE UNITS OF CONTENT
TO EACH VALENCE CATEGORY

NEGATIVE 


Ambiguous 
Amorphous 
Can  be  negated  by 
Conducted beyond that which is justified
Cannot  be  sustained 
Caution 

Could  not  be  used  to 
Critical  step  involves 
Criticized 
Dependent  on 
Difficult 
Difficulties  in 
Disadvantage 
Distortion 
Does  not  allow 
Drawback 
Embarrassing 
Exerts too great an influence
Expensive 
Forbids 

Formidable  task 

Hazard 

Ignores 

Impairs 

Impossible 

Impractical 

Inhibits 

Is  critical 

Lack  of 

Major  problem 

Makes  much  more  complicated 

Misinterprets 

Misunderstand 


Misused 

Must  be  available  for  use 

Must  be  careful 

Must  understand 

No  means  of  relating 

Not  easy 

Not  explicit 

Not  inherent  in 

Not  recommended 

Not  satisfactory 

Not  substitute  for 

Not  sufficient 

Not  viable 

Overstated 

Overwhelms 

Plenty  of  should  be  allowed 

Presents  a problem 

Prohibits 

Questionable 

Rarely  shown 

Requires 

Should  be  at  least 

Should  not  be  considered  for 

Specious 

Stereotyped 

Unable  to 

Uncertain 

Undesirable 

Unproven 

Untenable 

Untested 

Untrustworthy 

Usually  fails 

Vague


MIXED


Is  probably 

May 

May  be 

Not absolute guarantee against


Not  always 

Not  an  absolute 

Not  generally  possible 

Not  necessarily 

Potentially  useful 


POSITIVE 


Adjusts  well  to 

Advantage 

Aids 

Applicable 
Appropriate  for 
Assures  that 
Avoids 

Avoids  the  disadvantages 

Avoids  the  possibility 

Can  be  changed 

Can  be  used  to 

Conducive  to 

Does  not  require 

Easier  to  use 

Efficient 

Eliminates 

Encourages 

Excellent 

Fits  into 

Free  from 

Good 


Helps  to 

Less  subject  to 

Lends  itself  to 

Major  attraction  of 

Minimizes 

More  accurate 

Much  easier  to 

Not  limited  to 

Not  skewed  by 

Overcomes 

Provides 

Provides  sounder  basis 

Reduces 

Refines 

Removes 

Superior  to 

True 

Useful 

Uses  little  of 
Versatile 
Well  defined 
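
The way these phrase lists map a unit of content to a valence category can be illustrated with a small Python sketch. It is a hypothetical automation, not the procedure used in this research, where the coding was a matter of judgement. The dictionary holds only a subset of the phrases listed above, the function name and the example sentences are invented, and the rule that the longest matching phrase decides the category is an assumption made so that, for example, "potentially useful" is coded MIXED rather than POSITIVE.

```python
import re

# Illustrative subset of the phrase lists above (not the complete lists).
VALENCE_PHRASES = {
    "NEGATIVE": ["difficult", "disadvantage", "drawback", "not easy",
                 "not sufficient", "untested", "vague"],
    "MIXED": ["may be", "not always", "not necessarily", "potentially useful"],
    "POSITIVE": ["advantage", "easier to use", "eliminates", "reduces",
                 "useful", "versatile"],
}


def code_valence(unit_of_content):
    """Return the valence category of the longest phrase found, or None.

    ASSUMPTION: whole-word matching, with the longest matching phrase
    deciding the category when several phrases appear in the same unit.
    """
    text = unit_of_content.lower()
    best = None  # (phrase length, valence) of the best match so far
    for valence, phrases in VALENCE_PHRASES.items():
        for phrase in phrases:
            pattern = r"\b" + re.escape(phrase) + r"\b"
            if re.search(pattern, text):
                if best is None or len(phrase) > best[0]:
                    best = (len(phrase), valence)
    return best[1] if best else None


# Hypothetical units of content.
for unit in [
    "The technique is difficult to administer.",
    "The method is potentially useful for forecasting.",
    "Anonymity reduces status bias.",
]:
    print(unit, "->", code_valence(unit))
```

Whole-word matching keeps "advantage" from matching inside "disadvantage", which would otherwise push a negative statement toward the positive category.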
THE MARTIN COST MODEL

The  Martin  Cost  Model  is  a conceptual  model  for 
negotiated  sole-source  development  contracts  formulated 
by  M.  D.  Martin  in  his  doctoral  work  at  the  University  of 
Oklahoma. The model uses the parameters of time, uncertainty,
information, and their interactions, to illustrate
the  relationship  between  development  program  costs  and  the 
disorder  present  in  the  program  at  contract  award,  and  to 
predict  the  final  cost  of  a development  program  at  the  time 
of  contract  award. 

Time is taken as the underlying parameter used to
relate the other three parameters (61:117). As a decision
maker looks farther ahead in time in making decisions,
uncertainty about the outcome of a situation increases
(61:49). Risk and uncertainty can be equated in the decision
process. Risk is usually thought of in terms of the
possibility of a future event occurring, based on an
objective probability distribution. Uncertainty is
generally defined as that situation where no such objective
distribution exists, with little or no usable information
present. A decision maker faced with uncertainty,
where a decision is required, makes a subjective probability
assessment as to future outcomes based on his experience
and intuitive "feel" for the situation, thus treating
uncertainty in the same manner as risk (61:37-38). Uncertainty
at a certain point in the time continuum is inversely
proportional to the amount of information present at that
point. Martin distinguished between the amount and the
usefulness, or efficacy, of information, and stated that
the efficacy of information received from a management
information system was the most important variable in the
information-uncertainty relationship (61:117). A certain
cost to the decision maker will be associated with more
information/informational efficacy about a specific
situation. Program costs will also increase directly with
increases in uncertainty of program outcomes (61:117).

Martin  uses  the  communication  system  theory  of 
Shannon  and  Weaver  to  relate  the  uncertainty  in  a system 
to  the  entropy,  or  disorder,  present  in  a system  (61:122): 

\[ H = -\sum_{i} p_i \log p_i, \qquad i = 1, 2, 3, \ldots, n \]

where H equals entropy, and p_1, p_2, ..., p_n are the
individual  probabilities  of  choice  for  certain  outcomes  of 
the  system.  Thus  the  more  choices  that  are  available  in  a 
system,  the  more  the  disorder,  or  entropy,  present  in  the 
system. 
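
As a rough numerical illustration of the entropy measure just described, the following Python sketch computes the Shannon entropy, H = -sum(p_i log p_i), for a few distributions. The base-2 logarithm and the function name are assumptions; the text does not state which logarithm base Martin used.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution.

    ASSUMPTION: base-2 logarithm. Outcomes with zero probability
    contribute nothing to the sum.
    """
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)


two_choices = shannon_entropy([0.5, 0.5])
four_choices = shannon_entropy([0.25, 0.25, 0.25, 0.25])
certain = shannon_entropy([1.0])
print(two_choices, four_choices, certain)
```

With equally likely outcomes, going from two choices to four raises the entropy from 1 bit to 2 bits, while a single certain outcome gives zero entropy. This matches the statement that more available choices mean more disorder.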

It is possible to increase the order in a closed
information system, thus decreasing the entropy
(negentropy). In an information system, entropy can be
related to informational efficacy by the equations
(61:121-22):


H + IE  = 1;  IE  = 1 - H 


where  IE  equals  informational  efficacy,  or  order,  and  H 
equals  entropy,  or  disorder. 

Martin  made  the  following  assumptions  in  formulating 
the  model  (61:125-26): 

1. The theory is normative rather than descriptive.

2. The effective cost for a program can be
represented by a ratio of target costs to the
informational efficacy of the data in a closed
decision-making system.

3. Perfectly competitive markets exist. Contracts
are let on a competitively negotiated basis.
This assumption will be relaxed subsequently.

4. Entropy is the measure of the information
or disorder in a system; whereas, negentropy is a
measure of the order in a system and can be equated
with informational efficacy.

5. The informational efficacy varies inversely
with the number of choices or possible events which
can occur as related to the decision-making system.
If one course of action seems almost certain, then
informational efficacy is increased and vice versa.

6. Since contract price usually includes an
amount for profit, this fact must be considered.
Economic cost includes the profit factor, and can,
therefore, represent contract price.

7. No limitations on funding exist. Programs
can be fully funded in anticipation of a possible
cost growth.

Using these assumptions, the expected cost including
profit, or economic cost C_E, can be related to the initial
cost estimate C_T through the equation (61:126):

\[ C_E = \frac{C_T}{IE} \]

One can see that if IE equals one, or certainty exists about the
contract outcome, C_E equals C_T, and no error exists between
estimated and final costs. Thus the amount of informational
usefulness present in a weapon system development
program can be used by a program manager at contract award
time to estimate cost growth for the program, or decide on
the need for more information before awarding the contract.
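
The relationships described in this appendix can be tied together in a short Python sketch. Assumption 2 expresses the effective cost as the ratio of target cost to informational efficacy, and informational efficacy is IE = 1 - H. The sketch is illustrative only. Dividing the entropy by its maximum value, so that H lies between 0 and 1, is an assumption introduced here to make H + IE = 1 workable; the text does not say how Martin scaled H. The function names, probabilities, and cost figure are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy divided by its maximum, log2(n), so that 0 <= H <= 1.

    ASSUMPTION: this normalization is not stated in the source.
    """
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    n = len(probs)
    if n < 2:
        return 0.0  # a single possible outcome carries no uncertainty
    h = -sum(p * math.log2(p) for p in probs if p > 0)
    return min(1.0, h / math.log2(n))  # guard against rounding above 1


def informational_efficacy(h):
    """IE = 1 - H, for entropy H between 0 and 1."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("entropy must lie between 0 and 1")
    return 1.0 - h


def economic_cost(target_cost, ie):
    """Economic cost as the ratio of target cost to informational efficacy."""
    if not 0.0 < ie <= 1.0:
        raise ValueError("informational efficacy must be in (0, 1]")
    return target_cost / ie


# Hypothetical program: three possible outcomes, target cost of 100 units.
h = normalized_entropy([0.7, 0.2, 0.1])
ie = informational_efficacy(h)
print(h, ie, economic_cost(100.0, ie))
```

When IE equals one the function returns the target cost unchanged, and the estimate grows without bound as IE approaches zero, which is why the sketch rejects an IE of zero instead of dividing by it.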

Martin  demonstrated  the  logical  consistency  of  the 
model  in  his  dissertation  (61:146-47),  and  the  model  was 
used  by  two  AFIT  thesis  teams  with  the  following  limitations 
(38:35-36; 5:66-67):

1.  The  model  was  developed  for  application  to 
weapon  system  development  programs. 

2.  The  model  is  limited  by  the  quality  of 
information  subsystems  in  the  overall  program  system;  if 
information systems cannot identify all possible alternatives
at a given point in time, an accurate measure of
uncertainty  is  not  possible. 